Bug 209942 - [GTK][WPE] Replace fopen/fclose by fopen/fseek functions in MemoryPressureMonitor
Summary: [GTK][WPE] Replace fopen/fclose by fopen/fseek functions in MemoryPressureMon...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Pablo Saavedra
URL:
Keywords: InRadar
Depends on:
Blocks: 210345 210346
Reported: 2020-04-02 17:04 PDT by Pablo Saavedra
Modified: 2020-04-10 23:57 PDT
CC List: 7 users

See Also:


Attachments
patch (17.72 KB, patch)
2020-04-02 17:32 PDT, Pablo Saavedra
no flags Details | Formatted Diff | Diff
patch (18.29 KB, patch)
2020-04-05 02:56 PDT, Pablo Saavedra
no flags Details | Formatted Diff | Diff
patch (18.30 KB, patch)
2020-04-05 23:49 PDT, Pablo Saavedra
no flags Details | Formatted Diff | Diff

Description Pablo Saavedra 2020-04-02 17:04:59 PDT
The current implementation is executing open/close calls for several files (/proc/meminfo, /proc/self/cgroup, ...) in an infinite loop.

It could be a good idea to explore a refactoring using fseek/rewind/fsetpos to avoid unnecessary those open/close system calls.
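
(Illustration only, not part of the bug: a minimal sketch of the before/after pattern using hypothetical names; the actual MemoryPressureMonitor code differs.)

  #include <cstdio>

  // Before: every polling iteration pays for open() and close().
  //   FILE* memInfoFile = fopen("/proc/meminfo", "r");
  //   ... parse ...
  //   fclose(memInfoFile);

  // After: open once, then only rewind on each iteration.
  static FILE* s_memInfoFile = nullptr;

  static bool readMemInfoOnce()
  {
      if (!s_memInfoFile)
          s_memInfoFile = fopen("/proc/meminfo", "r"); // Retried on the next iteration on failure.
      if (!s_memInfoFile)
          return false;

      fseek(s_memInfoFile, 0, SEEK_SET); // Or rewind(s_memInfoFile).
      char line[128];
      while (fgets(line, sizeof(line), s_memInfoFile)) {
          // ... parse the fields the monitor needs ...
      }
      return true;
  }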
Comment 1 Pablo Saavedra 2020-04-02 17:32:41 PDT
Created attachment 395329 [details]
patch
Comment 2 Pablo Saavedra 2020-04-02 23:26:57 PDT
(In reply to Pablo Saavedra from comment #0)
> to avoid unnecessary those open/close system calls.

... to avoid those unnecessary open/close system calls
Comment 3 Carlos Garcia Campos 2020-04-03 02:12:53 PDT
Do we really want to keep those file descriptors open indefinitely?
Comment 4 Pablo Saavedra 2020-04-03 03:43:41 PDT
(In reply to Carlos Garcia Campos from comment #3)
> Do we really want to keep those file descriptors open indefinitely?

The current logic opens and closes those files every 5 seconds. On balance, it could make more sense to just keep them indefinitely open.
Comment 5 Carlos Alberto Lopez Perez 2020-04-03 05:24:00 PDT
(In reply to Pablo Saavedra from comment #4)
> (In reply to Carlos Garcia Campos from comment #3)
> > Do we really want to keep those file descriptors open indefinitely?
> 
> The current logic opens and closes thise files every 5 seconds. In the
> valance I could make more sense just keep then indefinitely open.

Yes, my thinking is that it's a better idea to keep them open forever than to open/close them every few seconds.

However, I realize now that it seems possible to move a running process from a memory cgroup to a different memory cgroup.

 Ex: https://docs.fedoraproject.org/en-US/Fedora/16/html/Resource_Management_Guide/sec-Moving_a_Process_to_a_Control_Group.html

I think that we should check for this and open the cgroup sysfs files of the new memory cgroup if it happens.

Another thing to take into account is that we can start running without a memory cgroup (so we start getting the memory info from /proc/meminfo), but then the admin puts the process inside a memory cgroup some seconds after we have started. So we have started the busy loop of the thread thinking we are not running inside a memory cgroup, but that is no longer true.

So I see no better option than to re-check in each loop iteration whether we are now in a memory cgroup or a different one. And to do that it seems necessary to re-check the memory cgroup sysfs related files on each loop.
Comment 6 Carlos Alberto Lopez Perez 2020-04-03 05:30:14 PDT
(In reply to Carlos Alberto Lopez Perez from comment #5)
> (In reply to Pablo Saavedra from comment #4)
> > (In reply to Carlos Garcia Campos from comment #3)
> > > Do we really want to keep those file descriptors open indefinitely?
> > 
> > The current logic opens and closes thise files every 5 seconds. In the
> > valance I could make more sense just keep then indefinitely open.
> 
> Yes, my thinking its that its a better idea to keep them open forever than
> to open/close them forever each few seconds.
> 

Or maybe it's not a better idea.

In the end you are replacing this:

-iteration-loop: fclose()/fopen() syscalls
+iteration-loop: fseek()/fflush() syscalls


So the number of syscalls per loop is the same.

But before, you were avoiding keeping the descriptors open when not needed.
Comment 7 Pablo Saavedra 2020-04-03 09:53:07 PDT
(In reply to Carlos Alberto Lopez Perez from comment #5)
> (In reply to Pablo Saavedra from comment #4)
> > (In reply to Carlos Garcia Campos from comment #3)
> > > Do we really want to keep those file descriptors open indefinitely?
> > 
> > The current logic opens and closes thise files every 5 seconds. In the
> > valance I could make more sense just keep then indefinitely open.
> 
> Yes, my thinking its that its a better idea to keep them open forever than
> to open/close them forever each few seconds.
> 
> However, I realize now that it seems possible to move a running process from
> a memory cgroup to a different cgroup memory.
> 
>  Ex:
> https://docs.fedoraproject.org/en-US/Fedora/16/html/
> Resource_Management_Guide/sec-Moving_a_Process_to_a_Control_Group.html
> 
> I think that we should check for this and open the cgroup sysfs files of the
> new memory cgroup if it happens.

Yes, this works. When this happens, the `memory:` path in /proc/self/cgroup is updated and the loop reads the new value again. Then, if the memoryControllerPath differs from m_cgroupMemoryControllerPath, setMemoryControllerPath() disposes of the old cgroup files (i.e. closes them) and opens the new cgroup memory files from the new memory controller path:

  void CGroupMemoryController::setMemoryControllerPath(CString memoryControllerPath)
  {
      if (memoryControllerPath == m_cgroupMemoryControllerPath)
          return;

> 
> Another thing to take into account its that we can start running without
> memory cgroup (so we start getting the memory info from /proc/meminfo), but
> then the admin puts the process inside a memory cgroup after some seconds
> after we have started. So we have started the busy loop of the thread
> thinking we are not running inside a memory cgroup, but that its not longer
> true.
> 

The same as above ^. Already tested.

> So I see no better option than to re-check in each loop iteration if we are
> now in a memory cgroup or a different one. And for doing that it seems
> needed to re-check the memory cgroup sysfs related files on each loop.
Comment 8 Pablo Saavedra 2020-04-03 12:47:39 PDT
(In reply to Carlos Alberto Lopez Perez from comment #6)
> (In reply to Carlos Alberto Lopez Perez from comment #5)
> > (In reply to Pablo Saavedra from comment #4)
> > > (In reply to Carlos Garcia Campos from comment #3)
> > > > Do we really want to keep those file descriptors open indefinitely?
> > > 
> > > The current logic opens and closes thise files every 5 seconds. In the
> > > valance I could make more sense just keep then indefinitely open.
> > 
> > Yes, my thinking its that its a better idea to keep them open forever than
> > to open/close them forever each few seconds.
> > 
> 
> Or maybe its not a better idea.
> 
> In the end you are replacing this:
> 
> -iteration-loop: fclose()/fopen() syscalls
> +iteration-loop: fseek()/fflush() syscalls
> 
> 
> So the number of syscalls per loop its the same.
> 
> But before you were avoiding having open the descriptors when not needed.

Well, those descriptors are going to be needed every 5 seconds anyway. If keeping those descriptors open is going to be a problem, then it is going to be a problem as well at the moment of reopening the files.
Comment 9 Adrian Perez 2020-04-03 13:31:52 PDT
(In reply to Carlos Alberto Lopez Perez from comment #6)
> (In reply to Carlos Alberto Lopez Perez from comment #5)
> > (In reply to Pablo Saavedra from comment #4)
> > > (In reply to Carlos Garcia Campos from comment #3)
> > > > Do we really want to keep those file descriptors open indefinitely?
> > > 
> > > The current logic opens and closes thise files every 5 seconds. In the
> > > valance I could make more sense just keep then indefinitely open.
> > 
> > Yes, my thinking its that its a better idea to keep them open forever than
> > to open/close them forever each few seconds.
> > 
> 
> Or maybe its not a better idea.
> 
> In the end you are replacing this:
> 
> -iteration-loop: fclose()/fopen() syscalls
> +iteration-loop: fseek()/fflush() syscalls
> 
> 
> So the number of syscalls per loop its the same.

Not quite, the situation is still very favorable towards keeping the files
always open:

 - fflush() only empties the *userspace* buffers maintained by the libc
   for the FILE objects—it does not do any system call.

 - The open() system call —used by fopen()— is more expensive than the
   lseek() system call —used by fseek()— because opening a file needs
   allocating kernel-side resources and checking user credentials and
   file access permissions (none of which is needed for seeking).

 - If we keep the files always open, it is possible to continue checking
   memory status even if the amount of available file descriptors is
   exhausted. If all possible file descriptors are used between closing
   and trying to open files again, opening will fail.

> But before you were avoiding having open the descriptors when not needed.

This is not a particularly good argument. While true, we *know* that the
file descriptors *will be needed again in some seconds*, so they *will*
be needed!
Comment 10 Adrian Perez 2020-04-03 13:33:04 PDT
(In reply to Adrian Perez from comment #9)
> (In reply to Carlos Alberto Lopez Perez from comment #6)
> > (In reply to Carlos Alberto Lopez Perez from comment #5)
> > > (In reply to Pablo Saavedra from comment #4)
> > > > (In reply to Carlos Garcia Campos from comment #3)
> > > > > Do we really want to keep those file descriptors open indefinitely?
> > > > 
> > > > The current logic opens and closes thise files every 5 seconds. In the
> > > > valance I could make more sense just keep then indefinitely open.
> > > 
> > > Yes, my thinking its that its a better idea to keep them open forever than
> > > to open/close them forever each few seconds.
> > > 
> > 
> > Or maybe its not a better idea.
> > 
> > In the end you are replacing this:
> > 
> > -iteration-loop: fclose()/fopen() syscalls
> > +iteration-loop: fseek()/fflush() syscalls
> > 
> > 
> > So the number of syscalls per loop its the same.
> 
> Not quite, the situation is still very favorable towards keeping the files
> always open:
> 
>  - fflush() only empties the *userspace* buffers maintained by the libc
>    for the FILE objects—it does not do any system call.
> 
>  - The open() system call —used by fopen()— is more expensive than the
>    lseek() system call —used by fseek()— because opening a file needs
>    allocating kernel-side resources and checking user credentials and
>    file access permissions (none of which is needed for seeking).

Also:

   - The close() system call —used by fclose()— is slightly more expensive
     than a seek operation because it has to deallocate kernel resources.

>  - If we keep the files always open, it is possible to continue checking
>    memory status even if the amount of available file descriptors is
>    exhausted. If all possible file descriptors are used between closing
>    and trying to open files again, opening will fail.
> 
> > But before you were avoiding having open the descriptors when not needed.
> 
> This is not a particularly good argument. While true, we *know* that the
> file descriptors *will be needed again in some seconds*, so they *will*
> be needed!
Comment 11 Adrian Perez 2020-04-03 13:47:13 PDT
Comment on attachment 395329 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=395329&action=review

I think this is a nice incremental improvement to the existing
code, but it needs a bit of work. I am particularly concerned
about the missing error checking when opening files. Could you
please look into the suggestions below?

> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:67
> +    fflush(zoneInfoFile);

Why do you need to use fflush() here? The fseek() right above is
responsible for arranging whatever is needed internally inside the
libc to make the next read return data starting at the first byte
of the file.

> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:180
> +    fflush(cgroupControllerFile);

Same here, flushing after seeking shouldn't be needed.

> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:212
> +    fflush(memInfoFile);

…and the same here.

> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:327
> +        FILE* cgroupControllerFile = fopen(s_procSelfCgroup, "r");

This is missing error checking. If the files cannot be opened
the FILE* variables will be null and trying to read from them
will be undefined behaviour. We need checks here, and do one of:

 1. Disable the memory pressure monitor. Probably not ideal.
 2. Retry opening the files periodically, optionally up to a
    certain maximum amount of attempts.
 3. Retry opening the files with exponential backoff.

For simplicity, I think we could go with option (2.) and retrying
every 5s, which is the same interval at which memory status reports
are gathered.
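
(Illustration only: a minimal sketch of option (2.), with a hypothetical helper name; the retry simply happens on the next 5-second iteration.)

  // Called at the start of every polling iteration for each monitored file.
  // Retries fopen() if a previous attempt failed; otherwise just rewinds.
  static FILE* ensureOpenAndRewound(FILE*& file, const char* path)
  {
      if (!file)
          file = fopen(path, "r");
      else
          fseek(file, 0, SEEK_SET);
      return file; // nullptr means "skip this source for this iteration".
  }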
Comment 12 Michael Catanzaro 2020-04-04 07:04:14 PDT
(Would make sense to use GMemoryMonitor on GLib ports if GLib 2.64 is available at build time.)
Comment 13 Carlos Alberto Lopez Perez 2020-04-04 09:02:37 PDT
(In reply to Michael Catanzaro from comment #12)
> (Would make sense to use GMemoryMonitor on GLib ports if GLib 2.64 is
> available at build time.)

That's interesting. Didn't know about it. 

I foresee some problems with it:
 - Requires a daemon (Low Memory Monitor) to be installed
 - Requires Linux Kernel 5.2 or newer

So maybe it makes sense to use it if available, but I think we should keep the current code at least as a fallback for older systems or for systems without that daemon installed.
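
(Illustration only: a minimal sketch of how the GLib 2.64 GMemoryMonitor API could be hooked up as an optional path; this is not part of the patch in this bug.)

  #include <gio/gio.h>

  static void lowMemoryWarningCallback(GMemoryMonitor*, GMemoryMonitorWarningLevel level, gpointer)
  {
      // Feed the warning level into the existing memory pressure handling.
      if (level >= G_MEMORY_MONITOR_WARNING_LEVEL_CRITICAL)
          g_message("Critical memory pressure reported by the memory monitor");
  }

  static void connectMemoryMonitor()
  {
      GMemoryMonitor* monitor = g_memory_monitor_dup_default();
      g_signal_connect(monitor, "low-memory-warning", G_CALLBACK(lowMemoryWarningCallback), nullptr);
  }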
Comment 14 Pablo Saavedra 2020-04-05 02:56:46 PDT
Created attachment 395496 [details]
patch
Comment 15 Pablo Saavedra 2020-04-05 03:04:08 PDT
(In reply to Adrian Perez from comment #11)
> Comment on attachment 395329 [details]
> patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=395329&action=review
> 
> I think this is a nice incremental improvement to the existing
> code, but it needs a bit of work. I am particularly concerned
> about the missing error checking when opening files. Could you
> please look into the suggestions below?
> 
> > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:67
> > +    fflush(zoneInfoFile);
> 
> Why do you need to use fflush() here? The fseek() right above is
> responsible to arrange whatever is needed internally inside the
> libc to make the next read return data starting at the first byte
> of the file.
> 
> > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:180
> > +    fflush(cgroupControllerFile);
> 
> Same here, flushing after seeking shouldn't be needed.
> 
> > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:212
> > +    fflush(memInfoFile);
> 

Fixed. All files opened with fopen() get a default allocated buffer (fully buffered), except stderr. I just set a null buffer for those files (setbuf(..., nullptr)). With this change the flush is not needed.

Now only fseek() is called on each iteration (instead of fopen()/fclose() on each iteration).
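
(Illustration only: a minimal sketch of the setbuf() change described above; not the literal patch.)

  FILE* memInfoFile = fopen("/proc/meminfo", "r");
  if (memInfoFile)
      setbuf(memInfoFile, nullptr); // Unbuffered: stdio keeps no input buffer, so nothing needs flushing.

  // On every polling iteration:
  fseek(memInfoFile, 0, SEEK_SET);
  // ... fgets()/parse as before; no fflush() needed.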

> …and the same here.
> 
> > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:327
> > +        FILE* cgroupControllerFile = fopen(s_procSelfCgroup, "r");
> 
> This is missing error checking. If the files cannot be opened
> the FILE* variables will be null and trying to read from them
> will be undefined behaviour. We need checks here, and do one of:
> 
>  1. Disable the memory pressure monitor. Probably not ideal.
>  2. Retry opening the files periodically, optionally up to a
>     certain maximum amount of attempts.
>  3. Retry opening the files with exponential backoff.
> 
> For simplicity, I think we could go with option (2.) and retrying
> every 5s, which is the same interval at which memory status reports
> are gathered.

Please check the new version of the patch. I moved the opening of those files into the loop and also check the fopen() result.
Comment 16 Pablo Saavedra 2020-04-05 03:11:30 PDT
(In reply to Carlos Alberto Lopez Perez from comment #13)
> (In reply to Michael Catanzaro from comment #12)
> > (Would make sense to use GMemoryMonitor on GLib ports if GLib 2.64 is
> > available at build time.)
> 
> That's interesting. Didn't know about it. 
> 
> I foresaw some problems with it:
>  - Requires a daemon (Low Memory Monitor) to be installed 
>  - Requires Linux Kernel 5.2 or newer
> 
> So maybe it makes sense to use it if available, but i think we should keep
> the current code at least as a fallback for older systems or for systems
> without that daemon installed.


I think it is a good replacement for the data coming from `/proc/meminfo`.

But I still see the monitoring of the cgroup memory controller files as necessary. From what I understand, GMemoryMonitor reports memory pressure information coming from the kernel but does not deal with the cgroup memory limits.
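
(For context, an illustration of the kind of cgroup v1 memory-controller reads that GMemoryMonitor does not cover; the concrete path depends on the controller path read from /proc/self/cgroup.)

  // e.g. /sys/fs/cgroup/memory/<controller-path>/memory.limit_in_bytes
  FILE* limitFile = fopen("/sys/fs/cgroup/memory/memory.limit_in_bytes", "r");
  size_t limitInBytes = 0;
  if (limitFile && fscanf(limitFile, "%zu", &limitInBytes) == 1) {
      // Use the cgroup limit instead of the system-wide totals from /proc/meminfo.
  }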
Comment 17 Pablo Saavedra 2020-04-05 23:49:47 PDT
Created attachment 395550 [details]
patch
Comment 18 Adrian Perez 2020-04-09 17:40:33 PDT
Comment on attachment 395550 [details]
patch

I have noticed today while re-reviewing this that there is a perfect
way of avoiding strtok(), please check the comments below. Thanks a lot
for your patience providing updates to the patch, it is much appreciated,
and I think we are quite close to being able to land this :)

View in context: https://bugs.webkit.org/attachment.cgi?id=395550&action=review

> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:76
>          if (!strncmp(line, "Node", 4)) {

I have noticed now that this is only searching for the “low”
value inside the first “Node” section, without taking into
account the kind of the memory zone.

Shouldn't we be searching for the “Normal” zone node, instead
of the first one seen? In my laptop here the first zone is a
DMA memory area:

  Node 0, zone      DMA
    per-node stats
        nr_inactive_anon 277959
        nr_active_anon 863174
        ...
    pages free      3951
          min       13 
          low       16
          high      19
          ...

…and picking “16” as the “low” value seems wrong. We will want
to fix this to find the node which describes the “Normal” zone:

  Node 0, zone     Normal
    pages free     252746
          min      14907
          low      26572
          high     31069
          ...

Related question: What if there are multiple nodes which describe
zones of “Normal” type? Should the “low” values be added together
in that case?

BTW, feel free to handle fixing this in a separate bug :)
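
(Illustration only, for the follow-up bug: one possible way to pick the “low” watermark of the “Normal” zone(s), adding them together if several nodes have one; it assumes the zoneInfoFile handle already used by the function.)

  size_t lowWatermark = 0;
  bool inNormalZone = false;
  char buffer[128];
  while (fgets(buffer, sizeof(buffer), zoneInfoFile)) {
      char zoneName[16 + 1];
      size_t value;
      if (sscanf(buffer, "Node %*u, zone %16s", zoneName) == 1)
          inNormalZone = !strcmp(zoneName, "Normal");
      else if (inNormalZone && sscanf(buffer, " low %zu", &value) == 1)
          lowWatermark += value;
  }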

> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:173
>  // 1:name=systemd:/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service

These lines look quite suitable to be parsed with fscanf():

  while (!feof(inputFile)) {
      char name[40 + 1];
      char path[PATH_MAX + 1];
      if (fscanf(inputFile, "%*u:%40[^:]:%" STRINGIFY(PATH_MAX) "[^\n]", name, path) != 2)
          break;
      if (!strcmp(name, "name=systemd"))
          return CString(path);
  }

  // Not found
  return CString();

With the STRINGIFY() macro being the usual preprocessor trick:

  #define STRINGIFY_EXPANDED(val) #val
  #define STRINGIFY(val) STRINGIFY_EXPANDED(val)

> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:217
> +    while (char* line = fgets(buffer, 128, memInfoFile)) {

Idea: instead of going line-by-line and using evil strtok(), take
advantage of… fscanf() again!

   memoryTotal = memoryFree = activeFile = inactiveFile = slabReclaimable = notSet;
   while (!feof(memInfoFile)) {
       char token[50 + 1];
       size_t amount;
       if (fscanf(memInfoFile, "%50s%zu kB", token, &amount) != 2)
           break;

       if (!strcmp(token, "MemTotal:"))
           memoryTotal = amount;
       else if (!strcmp(token, "MemFree:"))
           memoryFree = amount;
       else ...

      if (memoryTotal != notSet && memoryFree != notSet && activeFile != notSet && inactiveFile != notSet && slabReclaimable != notSet)
          break;
   }

Note how this avoids needing to manually split the input in tokens
ourselves: instead we let fscanf() do the hard work of parsing the
input and doing numeric conversions.

> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:420
> +    return value;

This whole function can be implemented as:

  size_t CGroupMemoryController::getCgroupFileValue(FILE* file)
  {
      size_t value;
      return (file && fscanf(file, "%zu", &value) == 1) ? value : notSet;
  }
Comment 19 Pablo Saavedra 2020-04-10 10:36:03 PDT
(In reply to Adrian Perez from comment #18)
> Comment on attachment 395550 [details]


Thanks Adrián for your review and, of course, for your suggestions. I'd prefer to avoid using this ticket to fix those things. Instead of that, I created these 2 new tickets to address the issues you described:

* [GTK][WPE] lowWatermarkPages() in MemoryPressureMonitor.cpp only searches the "low" value inside the first "Node" section: https://bugs.webkit.org/show_bug.cgi?id=210345

* [GTK][WPE] Replace evil strtok() calls with fscanf() in MemoryPressureMonitor.cpp: https://bugs.webkit.org/show_bug.cgi?id=210346



Let's keep the scope of this ticket to the replacement of the fopen/fclose logic with fopen/fseek. WDYT?
Comment 20 Adrian Perez 2020-04-10 14:13:10 PDT
(In reply to Pablo Saavedra from comment #19)
> (In reply to Adrian Perez from comment #18)
> > Comment on attachment 395550 [details]
> 
> 
> Thanks Adrián for your review and, of course, for your suggestions. I'd
> prefer avoid the usage this ticket to fix those things. Instance of that I
> created this 2 new tickets to address the issues described by you:
> 
> * [GTK][WPE] lowWatermarkPages() in MemoryPressureMonitor.cpp only searches
> the "low" value inside the first "Node" section:
> https://bugs.webkit.org/show_bug.cgi?id=210345
> 
> * [GTK][WPE] Replace evil strtok() calls with fscanf() in
> MemoryPressureMonitor.cpp: https://bugs.webkit.org/show_bug.cgi?id=210346
>
> Let's keep the scope of this ticket in the replacement of the fopen/fclose
> logic with fopen/fseek. WDYT?

Sounds good to me, let's go ahead and do one step at a time 👍️
Comment 21 EWS 2020-04-10 23:56:29 PDT
Committed r259923: <https://trac.webkit.org/changeset/259923>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 395550 [details].
Comment 22 Radar WebKit Bug Importer 2020-04-10 23:57:15 PDT
<rdar://problem/61625949>