Bug 220115

Summary: [GTK][WPE] UserInteractive threads are not Real-time in Linux
Product: WebKit Reporter: Philippe Normand <pnormand>
Component: PlatformAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, benjamin, bugs-noreply, cdumez, cgarcia, cmarcelo, ews-watchlist, gyuyoung.kim, jasemabeed114, mcatanzaro, pgriffis, ryuan.choi, sergio, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=220150
Attachments:
Description Flags
Patch
none
Patch
ews-feeder: commit-queue-
WIP patch
none
WIP patch
none
WIP patch
none
Add Realtime portal support
none
Add Realtime portal support
none
Add Realtime portal support
none
Add Realtime portal support
none
Patch
ews-feeder: commit-queue-
Patch
none
Patch
none
[PATCH] Update realtime portal usage
none
Patch mcatanzaro: review+

Description Philippe Normand 2020-12-23 05:28:23 PST
Currently UserInteractive threads created by the AudioWorklet seem to be RT only on Darwin which has HAVE(QOS_CLASSES) enabled.
Comment 1 Philippe Normand 2021-02-01 08:55:14 PST
Created attachment 418884 [details]
Patch
Comment 2 Philippe Normand 2021-02-01 08:56:37 PST
Just an early WIP experiment, doesn't work though:

RTPRIO soft limit is 0, hard is 0
Setting policy failed: Operation not permitted.

I guess it's related with flatpak and/or selinux.
Comment 3 Philippe Normand 2021-05-30 13:03:09 PDT
Maybe we can use the rtkit (dbus) API: http://git.0pointer.net/rtkit.git/tree/README
Comment 4 Philippe Normand 2021-06-06 09:39:02 PDT
I'm not sure how I'm supposed to use rtkit from a sandboxed process though... I would need the host PID of the WebProcess, within the sandbox... Otherwise rtkit won't find the infos of the PID to modify in the host /proc...
Comment 5 Philippe Normand 2021-06-06 10:05:16 PDT
https://github.com/flatpak/xdg-desktop-portal/issues/355
Comment 6 Philippe Normand 2021-06-10 12:06:13 PDT
I started a branch, but without a portal for rtkit it's useless. Next task would be to work on such portal I suppose.
Comment 7 Philippe Normand 2021-06-22 06:11:31 PDT
Created attachment 431959 [details]
Patch
Comment 8 Carlos Garcia Campos 2021-06-23 00:47:45 PDT
Why do we need to set the limits? isn't it enough if getting the limits with sched_get_priority_max?
Comment 9 Philippe Normand 2021-06-23 01:01:51 PDT
(In reply to Carlos Garcia Campos from comment #8)
> Why do we need to set the limits? isn't it enough if getting the limits with
> sched_get_priority_max?

That's what the first patch was doing, and failing, because the process needs special permissions to use this API.

I think that's why rtkit was written in the first place... To allow unprivileged processes to gain RT ability.
Comment 10 Carlos Garcia Campos 2021-06-23 01:04:25 PDT
(In reply to Philippe Normand from comment #9)
> (In reply to Carlos Garcia Campos from comment #8)
> > Why do we need to set the limits? isn't it enough if getting the limits with
> > sched_get_priority_max?
> 
> That's what the first patch was doing, and failing, because the process
> needs special permissions to use this API.
> 
> I think that's why rtkit was written in the first place... To allow
> unprivileged processes to gain RT ability.

That's because of the sandbox or because of the process effective user?
Comment 11 Philippe Normand 2021-06-23 01:16:59 PDT
(In reply to Carlos Garcia Campos from comment #10)
> (In reply to Philippe Normand from comment #9)
> > (In reply to Carlos Garcia Campos from comment #8)
> > > Why do we need to set the limits? isn't it enough if getting the limits with
> > > sched_get_priority_max?
> > 
> > That's what the first patch was doing, and failing, because the process
> > needs special permissions to use this API.
> > 
> > I think that's why rtkit was written in the first place... To allow
> > unprivileged processes to gain RT ability.
> 
> That's because of the sandbox or because of the process effective user?

Sandbox, at least. The first patch was failing here (see comment 2).
Comment 12 Carlos Garcia Campos 2021-06-25 06:09:07 PDT
Created attachment 432258 [details]
WIP patch

This works when sandbox is disabled, we would need a rtkit portal to make this work unser the sandbox.
Comment 13 Philippe Normand 2021-06-25 06:21:22 PDT
Comment on attachment 432258 [details]
WIP patch

View in context: https://bugs.webkit.org/attachment.cgi?id=432258&action=review

> Source/WTF/wtf/glib/RealTimeKit.cpp:2
> + * Copyright (C) 2020 Jan-Michael Brummer <jan.brummer@tabos.org>

Is this accurate?
Comment 14 Carlos Garcia Campos 2021-06-25 06:26:52 PDT
Some observations:

 - The patch doesn't allow to make the main thread real time. This is only used by the web process to match UI process (in mac), but our UI process is not real time.

 - We might consider to use nice value to make QOS::UserInitiated threads have a higher priority than QOS::Default. That requires to use rtkit too.

 - Thread::create() uses QOS::UserInitiated by default not QOS::Default.

 - WorkQueue::create() uses QOS::Default by default.

 - With a quick grep I can see these threads in every qos

QOS::UserInteractive: webaudio AudioWorkletThread (when isAudioContextRealTime is true), com.apple.WebKit.EventDispatcher, async scrolling thread and some others we don't currently use.

QOS::UserInitiated: org.webkit.DataURLDecoder, network process main thread and all Thread::create() calls not passing a qos value (I guess most of them).

QOS::Default: webaudio AudioWorkletThread (when isAudioContextRealTime is false), webaudio offline renderer, network process cache.

QOS::Utility: org.webkit.BlobUtility, WebResourceLoadStatisticsStore Process Data Queue

QOS::Background: network process cache.
Comment 15 Carlos Garcia Campos 2021-06-25 06:27:19 PDT
(In reply to Philippe Normand from comment #13)
> Comment on attachment 432258 [details]
> WIP patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=432258&action=review
> 
> > Source/WTF/wtf/glib/RealTimeKit.cpp:2
> > + * Copyright (C) 2020 Jan-Michael Brummer <jan.brummer@tabos.org>
> 
> Is this accurate?

No I copy pasted the header license from other file :-P
Comment 16 Carlos Garcia Campos 2021-06-25 06:28:19 PDT
Created attachment 432259 [details]
WIP patch
Comment 18 Michael Catanzaro 2021-06-25 06:40:03 PDT
Comment on attachment 432259 [details]
WIP patch

View in context: https://bugs.webkit.org/attachment.cgi?id=432259&action=review

Honestly it's hard to see why it's useful to add a feature that only works when the web process sandbox is disabled, especially since we want to remove the ability to disable the sandbox in the GTK 4 API.

You're so close to making this actually work for real. What you currently have in RealTimeKit.cpp is pretty close to the entire portal implementation you'd need in xdg-desktop-portal. It would just need some extra code to translate pids from the process's pid namespace to the host pid namespace. Then you'd adjust the D-Bus calls to use the portal instead.

> Source/WTF/wtf/Threading.cpp:351
> +    // We don't allow to make the main thread real time.
> +    if (isMainThread())
> +        return;

The comment should say why.
Comment 19 Carlos Garcia Campos 2021-06-25 07:28:37 PDT
(In reply to Michael Catanzaro from comment #18)
> Comment on attachment 432259 [details]
> WIP patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=432259&action=review
> 
> Honestly it's hard to see why it's useful to add a feature that only works
> when the web process sandbox is disabled,

Because you can disable it if you want to get the features not working under the sandbox, like this one. And because this is a necessary first step required to make it work under the sandbox too.

> especially since we want to remove
> the ability to disable the sandbox in the GTK 4 API.

I don't, unless there aren't regressions. We already regressed several features introducing the sandbox, let's not make them mandatory at least.

> You're so close to making this actually work for real.

It works for real, not everybody is enabling the sandbox in every single app. Or geolocation, power notification and printing don't work for real either? Should we remove those because they don't work for people/apps enabling the sandbox?

> What you currently
> have in RealTimeKit.cpp is pretty close to the entire portal implementation
> you'd need in xdg-desktop-portal. It would just need some extra code to
> translate pids from the process's pid namespace to the host pid namespace.
> Then you'd adjust the D-Bus calls to use the portal instead.
> 
> > Source/WTF/wtf/Threading.cpp:351
> > +    // We don't allow to make the main thread real time.
> > +    if (isMainThread())
> > +        return;
> 
> The comment should say why.
Comment 20 Michael Catanzaro 2021-06-25 07:51:13 PDT
> It works for real, not everybody is enabling the sandbox in every single
> app. Or geolocation, power notification and printing don't work for real
> either? Should we remove those because they don't work for people/apps
> enabling the sandbox?

It's extremely inadvisable to run WebKit with the sandbox disabled. We should aim to fix all sandboxing regressions rather than allow disabling the sandbox. Geolocation should be easy because there is a location portal already. Battery status is a little harder because there's no portal for it, but we just have to write one, same as here. Printing might be tougher because we need to rewrite WebKitPrintOperation.
Comment 21 Carlos Garcia Campos 2021-06-25 07:52:08 PDT
There's another problem of this approach, rtkit limits the amount of threads that can be made real time for a user and that's very easy to get with a few tabs open:

rtkit-daemon[934]: Warning: Reached maximum concurrent process limit for user '1001', denying request.
Comment 22 Carlos Garcia Campos 2021-06-25 07:54:28 PDT
(In reply to Michael Catanzaro from comment #20)
> > It works for real, not everybody is enabling the sandbox in every single
> > app. Or geolocation, power notification and printing don't work for real
> > either? Should we remove those because they don't work for people/apps
> > enabling the sandbox?
> 
> It's extremely inadvisable to run WebKit with the sandbox disabled. We
> should aim to fix all sandboxing regressions rather than allow disabling the
> sandbox. Geolocation should be easy because there is a location portal
> already. Battery status is a little harder because there's no portal for it,
> but we just have to write one, same as here. Printing might be tougher
> because we need to rewrite WebKitPrintOperation.

And a11y. Sure, let's fix the regressions, but I won't block new features that don't work under the sandbox for the same reason we introduced the sandbox with so many known regressions.
Comment 23 Michael Catanzaro 2021-06-25 11:24:37 PDT
(In reply to Carlos Garcia Campos from comment #21)
> There's another problem of this approach, rtkit limits the amount of threads
> that can be made real time for a user and that's very easy to get with a few
> tabs open:
> 
> rtkit-daemon[934]: Warning: Reached maximum concurrent process limit for
> user '1001', denying request.

Hm, is it maybe possible to change the thread back to non-rt status when the page is not visible? I'm sure that's harder, but each app should only have one foreground page at a time, right? That should help.

(If it's any consolation, Epiphany web processes currently start crashing on startup after something like 80 web processes are opened, which is the point where it runs out of inotify handles. So I won't lose much sleep over failure to make the thread realtime. :)
Comment 24 Carlos Garcia Campos 2021-06-26 00:31:12 PDT
(In reply to Michael Catanzaro from comment #23)
> (In reply to Carlos Garcia Campos from comment #21)
> > There's another problem of this approach, rtkit limits the amount of threads
> > that can be made real time for a user and that's very easy to get with a few
> > tabs open:
> > 
> > rtkit-daemon[934]: Warning: Reached maximum concurrent process limit for
> > user '1001', denying request.
> 
> Hm, is it maybe possible to change the thread back to non-rt status when the
> page is not visible? I'm sure that's harder, but each app should only have
> one foreground page at a time, right? That should help.

Indeed, I thought about that. That means we would need to handle the real time threads from the WebKit layer somehow, because WTF doesn't know if a particular thread blongs to a page visible or hidden. I think we could add a ThreadGroup for real time threads, when a thread is created or promoted to real time it's added to the group. Then we would need to monitor the group from the web process to actually promote the threads in the group to real time when page is visible and demote when hidden. Problem here is when we have more than one page in the same web process. If we end up handling real time threads from the wk layer, maybe we could forward the requests to the UI process to escape the sandbox too.

Another thing I've realized is that we should probably need to handle SIGXCPU signal, to demote real time threads and avoid crashes.

> (If it's any consolation, Epiphany web processes currently start crashing on
> startup after something like 80 web processes are opened, which is the point
> where it runs out of inotify handles. So I won't lose much sleep over
> failure to make the thread realtime. :)

wow 80 processes are a lot in any case. RealtimeKit allows 15 processes per user and 25 threads per user.
Comment 25 Carlos Garcia Campos 2021-06-26 00:42:54 PDT
(In reply to Carlos Garcia Campos from comment #24)
> (In reply to Michael Catanzaro from comment #23)
> > (In reply to Carlos Garcia Campos from comment #21)
> > > There's another problem of this approach, rtkit limits the amount of threads
> > > that can be made real time for a user and that's very easy to get with a few
> > > tabs open:
> > > 
> > > rtkit-daemon[934]: Warning: Reached maximum concurrent process limit for
> > > user '1001', denying request.
> > 
> > Hm, is it maybe possible to change the thread back to non-rt status when the
> > page is not visible? I'm sure that's harder, but each app should only have
> > one foreground page at a time, right? That should help.
> 
> Indeed, I thought about that. That means we would need to handle the real
> time threads from the WebKit layer somehow, because WTF doesn't know if a
> particular thread blongs to a page visible or hidden. I think we could add a
> ThreadGroup for real time threads, when a thread is created or promoted to
> real time it's added to the group. Then we would need to monitor the group
> from the web process to actually promote the threads in the group to real
> time when page is visible and demote when hidden. Problem here is when we
> have more than one page in the same web process.

This shouldn't be a problem, the scrolling thread and EventDispatcher are both singletons.

> If we end up handling real
> time threads from the wk layer, maybe we could forward the requests to the
> UI process to escape the sandbox too.
> 
> Another thing I've realized is that we should probably need to handle
> SIGXCPU signal, to demote real time threads and avoid crashes.
> 
> > (If it's any consolation, Epiphany web processes currently start crashing on
> > startup after something like 80 web processes are opened, which is the point
> > where it runs out of inotify handles. So I won't lose much sleep over
> > failure to make the thread realtime. :)
> 
> wow 80 processes are a lot in any case. RealtimeKit allows 15 processes per
> user and 25 threads per user.
Comment 26 Carlos Garcia Campos 2021-06-26 00:46:32 PDT
And I think we can forget about implementing UserInitiated threads using nice value, because they are a lot more and we will reach the rtkit limit easily.
Comment 27 Michael Catanzaro 2021-06-26 06:01:20 PDT
(In reply to Carlos Garcia Campos from comment #24)
> If we end up handling real
> time threads from the wk layer, maybe we could forward the requests to the
> UI process to escape the sandbox too.

It would only work for the bubblewrap sandbox. Under flatpak, the UI process is sandboxed too, so it would still fail unless it uses a portal.

> wow 80 processes are a lot in any case. RealtimeKit allows 15 processes per
> user and 25 threads per user.

Firefox users expect to be able to open hundreds of browser tabs. They probably don't expect hundreds of browser tabs to all get the realtime scheduler, though. So if we can make it just the visible tab, that would be amazing. Maybe also tabs that are playing audio.
Comment 28 Michael Catanzaro 2021-06-26 06:09:39 PDT
(In reply to Carlos Garcia Campos from comment #26)
> And I think we can forget about implementing UserInitiated threads using
> nice value, because they are a lot more and we will reach the rtkit limit
> easily.

The nice value is honestly pretty useless. I guess it's better than nothing, but I doubt it would have much impact in practice.

Whereas the realtime scheduler is a really big hammer that allows you to always preempt all non-realtime processes on the system. I'm actually not sure if it's safe to use for web content without some way to ensure that a JavaScript <script>while(1);</script> loop won't hang the entire system. The risk is that the realtime web process would always preempt the UI process, preventing the UI process from activating the unresponsive web process killing. Maybe the UI process main thread needs to be realtime too?

I didn't know about SIGXCPU.
Comment 29 Carlos Garcia Campos 2021-06-27 01:45:26 PDT
(In reply to Michael Catanzaro from comment #27)
> (In reply to Carlos Garcia Campos from comment #24)
> > If we end up handling real
> > time threads from the wk layer, maybe we could forward the requests to the
> > UI process to escape the sandbox too.
> 
> It would only work for the bubblewrap sandbox. Under flatpak, the UI process
> is sandboxed too, so it would still fail unless it uses a portal.

Then it's better to handle it in the web process and wait for the portal to make it work under the sandbox.

> > wow 80 processes are a lot in any case. RealtimeKit allows 15 processes per
> > user and 25 threads per user.
> 
> Firefox users expect to be able to open hundreds of browser tabs. They
> probably don't expect hundreds of browser tabs to all get the realtime
> scheduler, though. So if we can make it just the visible tab, that would be
> amazing. Maybe also tabs that are playing audio.

I guess firefox demotes to non real time threads of non visible tabs, or maybe they only use it for audio thread, I don't know what firefox does.
Comment 30 Carlos Garcia Campos 2021-06-27 01:59:01 PDT
(In reply to Michael Catanzaro from comment #28)
> (In reply to Carlos Garcia Campos from comment #26)
> > And I think we can forget about implementing UserInitiated threads using
> > nice value, because they are a lot more and we will reach the rtkit limit
> > easily.
> 
> The nice value is honestly pretty useless. I guess it's better than nothing,
> but I doubt it would have much impact in practice.

I don't know.

> Whereas the realtime scheduler is a really big hammer that allows you to
> always preempt all non-realtime processes on the system. I'm actually not
> sure if it's safe to use for web content without some way to ensure that a
> JavaScript <script>while(1);</script> loop won't hang the entire system.

JavaScript runs in the main thread, that we don't allow to make real time. Only even handler, async scrolling and webaudio threads are real time.

> The
> risk is that the realtime web process would always preempt the UI process,
> preventing the UI process from activating the unresponsive web process
> killing. Maybe the UI process main thread needs to be realtime too?
> 
> I didn't know about SIGXCPU.

To prevent those situation of blocking the entire system, rtkit requires you to set a RLIMIT_RTTIME, which is the amount of CPU time that a real time process can consume without making a blocking system call. SIGXCPU signal is sent when the soft limit is reached, if not handled or if the process keeps consuming CPU time, a new SIGXCPU is called every second until the hard limit is reached and then SIGKILL is sent.

Setting this limit is a requirement to use rtkit, but rtkit also imposes a maximum value you can set as limit (see RTTimeUSecMax property we get) of 200ms. This is another reason why we can't make main threads real time, I think it's easy to reach this limit in the main thread.

So, to avoid being killed, I think we should handle the SIGXCPU signal and demote the thread to non-real time.
Comment 31 Carlos Garcia Campos 2021-06-28 03:39:52 PDT
Created attachment 432378 [details]
WIP patch
Comment 32 Patrick Griffis 2021-07-02 18:35:36 PDT
A portal for this was submitted upstream: https://github.com/flatpak/xdg-desktop-portal/pull/600
Comment 33 Patrick Griffis 2021-07-10 11:14:31 PDT
Comment on attachment 432378 [details]
WIP patch

View in context: https://bugs.webkit.org/attachment.cgi?id=432378&action=review

> Source/WTF/wtf/posix/ThreadingPOSIX.cpp:308
> +            LOG_ERROR("Failed to set sched policy %d for thread %d: %s", policy, threadHandle, strerror(error));

../Source/WTF/wtf/posix/ThreadingPOSIX.cpp:308:23: warning: format ‘%d’ expects argument of type ‘int’, but argument 6 has type ‘pthread_t’ {aka ‘long unsigned int’} [-Wformat=]
  308 |             LOG_ERROR("Failed to set sched policy %d for thread %d: %s", policy, threadHandle, strerror(error));
      |                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~          ~~~~~~~~~~~~
      |                                                                                  |
      |                                                                                  pthread_t {aka long unsigned int}
Comment 34 Patrick Griffis 2021-07-10 12:21:20 PDT
Created attachment 433265 [details]
Add Realtime portal support

This is a quick patch adding support for the Realtime portal on top of your patch.

Its fairly straightforward but perhaps could be cleaned up.

I did have to increase the timeout otherwise the portal would hit it occasionally.
Comment 35 Patrick Griffis 2021-07-10 12:26:06 PDT
Created attachment 433266 [details]
Add Realtime portal support
Comment 36 Patrick Griffis 2021-07-10 12:38:35 PDT
Created attachment 433267 [details]
Add Realtime portal support
Comment 37 Philippe Normand 2021-07-17 13:47:09 PDT
Comment on attachment 433267 [details]
Add Realtime portal support

View in context: https://bugs.webkit.org/attachment.cgi?id=433267&action=review

> Source/WTF/wtf/glib/RealTimeKit.cpp:81
> +        GRefPtr<GVariant> result = g_dbus_proxy_call_sync (proxy.get(), "org.freedesktop.DBus.Properties.Get",

Missing adoptGRef() I think
Comment 38 Patrick Griffis 2021-07-18 13:23:10 PDT
Created attachment 433758 [details]
Add Realtime portal support
Comment 39 Carlos Garcia Campos 2021-10-15 06:41:09 PDT
Created attachment 441371 [details]
Patch
Comment 40 Michael Catanzaro 2021-10-15 08:14:45 PDT
Comment on attachment 441371 [details]
Patch

Lots of unhappy EWS here. :/
Comment 41 Michael Catanzaro 2021-10-15 08:21:35 PDT
*** Bug 227529 has been marked as a duplicate of this bug. ***
Comment 42 Philippe Normand 2021-10-15 08:35:40 PDT
(In reply to Patrick Griffis from comment #32)
> A portal for this was submitted upstream:
> https://github.com/flatpak/xdg-desktop-portal/pull/600

What's left to have this merged?
Comment 43 Carlos Garcia Campos 2021-10-18 05:19:42 PDT
Created attachment 441589 [details]
Patch
Comment 44 Carlos Garcia Campos 2021-10-18 06:25:03 PDT
Created attachment 441595 [details]
Patch
Comment 45 Michael Catanzaro 2021-10-18 08:59:30 PDT
Comment on attachment 441595 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441595&action=review

Looks mostly good. Couple issues:

> Source/WTF/wtf/linux/RealTimeThreads.cpp:128
> +        LOG_ERROR("Failed to set current thread as real time: %s", strerror(error));

I will complain about strerr below.

> Source/WTF/wtf/linux/RealTimeThreads.cpp:173
> +    RELEASE_ASSERT_NOT_REACHED();

Woah! We assert if an rtkit properly is invalid. That's not programmer error in WebKit, and it's not enough reason to crash either. Instead, call g_set_error_literal() with some canned message and return -1.

> Source/WTF/wtf/linux/RealTimeThreads.h:63
> +#if USE(GLIB)
> +    std::optional<GRefPtr<GDBusProxy>> m_realTimeKitProxy;
> +    RunLoop::Timer<RealTimeThreads> m_discardRealTimeKitProxyTimer;
> +#endif

You're careful to lock access to m_threadGroup, and to use m_enabled only on the main thread, but I see no such care with m_realTimeKitProxy and m_discardRealTimeKitProxyTimer. I think these both need to be guarded by a mutex. Correct?

> Source/WTF/wtf/posix/ThreadingPOSIX.cpp:308
> +            LOG_ERROR("Failed to set sched policy %d for thread %d: %s", policy, threadHandle, strerror(error));

Careful! strerror is not threadsafe and cannot be safely used *anywhere* in WebKit, or in other libraries too. (It returns a pointer to static data that, sadly, may be mutated by another thread.) Since g_strerror is not available here, you have to use strerror_r instead, which is frustratingly awkward to use since you have to manually allocate a buffer for the error string and then free it. But it's even worse, because glibc sabotaged everything by implementing it differently than POSIX, see http://club.cc.cmu.edu/~cmccabe/blog_strerror.html, and it's impossible to write code that is compatible with both versions, so it's all just such a huge pain to do correctly.

Anyway, there are a few other places in WebKit that make the same mistake, so you might want to define a wrapper in WTF that works similarly to g_strerror and avoids the problem. If you don't want to spend time on that, honestly the easiest thing to do would be to not print the error message, or just print it as an integer code. Actually using strerror_r requires extreme care. To get the standard version, you need (_POSIX_C_SOURCE >= 200112L) && ! _GNU_SOURCE.
Comment 46 Michael Catanzaro 2021-10-18 09:01:50 PDT
(In reply to Philippe Normand from comment #42)
> What's left to have this merged?

Unresolved discussion regarding the API: https://github.com/flatpak/xdg-desktop-portal/pull/600#discussion_r676350479
Comment 47 Patrick Griffis 2021-10-18 12:39:42 PDT
I'll get the PR merged today or tomorrow. I also made the API more simple so my earlier patches can be ignored. Its the same API just a different bus name now basically.
Comment 48 Patrick Griffis 2021-10-18 12:43:28 PDT
And I think it will be cleaner to just accept this patch without the portal backend until bug 231670 is merged which changes how portals are used.
Comment 49 Michael Catanzaro 2021-10-18 12:50:30 PDT
(In reply to Patrick Griffis from comment #48)
> And I think it will be cleaner to just accept this patch without the portal
> backend until bug 231670 is merged which changes how portals are used.

That already landed this morning.
Comment 50 Patrick Griffis 2021-10-18 13:10:02 PDT
Created attachment 441635 [details]
[PATCH] Update realtime portal usage

This is the diff to use the latest portal changes.
Comment 51 Michael Catanzaro 2021-10-18 15:40:40 PDT
(In reply to Michael Catanzaro from comment #45)
> Anyway, there are a few other places in WebKit that make the same mistake,
> so you might want to define a wrapper in WTF that works similarly to
> g_strerror and avoids the problem. If you don't want to spend time on that,
> honestly the easiest thing to do would be to not print the error message, or
> just print it as an integer code. Actually using strerror_r requires extreme
> care. To get the standard version, you need (_POSIX_C_SOURCE >= 200112L) &&
> ! _GNU_SOURCE.

Adrian inspired me to do something about this: bug #231913.
Comment 52 Carlos Garcia Campos 2021-10-19 02:29:02 PDT
Comment on attachment 441595 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=441595&action=review

>> Source/WTF/wtf/linux/RealTimeThreads.cpp:128
>> +        LOG_ERROR("Failed to set current thread as real time: %s", strerror(error));
> 
> I will complain about strerr below.

This is always called in the main thread.

>> Source/WTF/wtf/linux/RealTimeThreads.cpp:173
>> +    RELEASE_ASSERT_NOT_REACHED();
> 
> Woah! We assert if an rtkit properly is invalid. That's not programmer error in WebKit, and it's not enough reason to crash either. Instead, call g_set_error_literal() with some canned message and return -1.

That would mean it's not the rtkit service we expect to talk to, we always expect the properties to be either a int64 or int32.

>> Source/WTF/wtf/linux/RealTimeThreads.h:63
>> +#endif
> 
> You're careful to lock access to m_threadGroup, and to use m_enabled only on the main thread, but I see no such care with m_realTimeKitProxy and m_discardRealTimeKitProxyTimer. I think these both need to be guarded by a mutex. Correct?

RunLoop::Timer is thread safe. it's a *main loop* timer, so the callback is always called in the main thread. And m_realTimeKitProxy is also called in the main thread only, because promoteThreadToRealTime() is always called in the main thread.

>> Source/WTF/wtf/posix/ThreadingPOSIX.cpp:308
>> +            LOG_ERROR("Failed to set sched policy %d for thread %d: %s", policy, threadHandle, strerror(error));
> 
> Careful! strerror is not threadsafe and cannot be safely used *anywhere* in WebKit, or in other libraries too. (It returns a pointer to static data that, sadly, may be mutated by another thread.) Since g_strerror is not available here, you have to use strerror_r instead, which is frustratingly awkward to use since you have to manually allocate a buffer for the error string and then free it. But it's even worse, because glibc sabotaged everything by implementing it differently than POSIX, see http://club.cc.cmu.edu/~cmccabe/blog_strerror.html, and it's impossible to write code that is compatible with both versions, so it's all just such a huge pain to do correctly.
> 
> Anyway, there are a few other places in WebKit that make the same mistake, so you might want to define a wrapper in WTF that works similarly to g_strerror and avoids the problem. If you don't want to spend time on that, honestly the easiest thing to do would be to not print the error message, or just print it as an integer code. Actually using strerror_r requires extreme care. To get the standard version, you need (_POSIX_C_SOURCE >= 200112L) && ! _GNU_SOURCE.

Yes, I followed the pattern of the previous LOG_ERROR in this function...
Comment 53 Carlos Garcia Campos 2021-10-19 03:04:04 PDT
Created attachment 441705 [details]
Patch
Comment 54 Michael Catanzaro 2021-10-19 07:34:07 PDT
(In reply to Carlos Garcia Campos from comment #52)
> This is always called in the main thread.

That doesn't mean it's OK to use.

Anyway, we have bug #231913 for this. I suppose I can always just update the patch there if this one lands first.

> >> Source/WTF/wtf/linux/RealTimeThreads.cpp:173
> >> +    RELEASE_ASSERT_NOT_REACHED();
> > 
> > Woah! We assert if an rtkit properly is invalid. That's not programmer error in WebKit, and it's not enough reason to crash either. Instead, call g_set_error_literal() with some canned message and return -1.
> 
> That would mean it's not the rtkit service we expect to talk to, we always
> expect the properties to be either a int64 or int32.

That's not a WebKit programmer error, so assert is definitely inappropriate.

Crashing seems inappropriate too. What's wrong with returning a normal error.

> RunLoop::Timer is thread safe. it's a *main loop* timer, so the callback is
> always called in the main thread. And m_realTimeKitProxy is also called in
> the main thread only, because promoteThreadToRealTime() is always called in
> the main thread.

OK, can you add some ASSERT(isMainThread()) then? That would make the safety easier to see.
Comment 55 Michael Catanzaro 2021-10-19 07:36:32 PDT
Comment on attachment 441705 [details]
Patch

OK, I see you handled everything except the strerror, which we can address separately.
Comment 56 Carlos Garcia Campos 2021-10-20 01:26:52 PDT
Committed r284525 (243268@main): <https://commits.webkit.org/243268@main>
Comment 57 Radar WebKit Bug Importer 2021-10-20 01:27:17 PDT
<rdar://problem/84452386>