210346 – [GTK][WPE] Replace evil strtok() calls with fscanf() in MemoryPressureMonitor.cpp

RESOLVED FIXED 210346

[GTK][WPE] Replace evil strtok() calls with fscanf() in MemoryPressureMonitor.cpp

https://bugs.webkit.org/show_bug.cgi?id=210346

Summary [GTK][WPE] Replace evil strtok() calls with fscanf() in MemoryPressureMonitor...

Pablo Saavedra

Reported 2020-04-10 10:29:50 PDT

Related to https://bugs.webkit.org/show_bug.cgi?id=209942 See details in https://bugs.webkit.org/show_bug.cgi?id=209942#c18 > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:173 > // 1:name=systemd:/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service These lines look quite suitable to be parsed with fscanf(): while (!feof(inputFile)) { char name[40 + 1]; char path[PATH_MAX + 1]; fscanf("%*u:%40[^:]:%" STRINGIFY(PATH_MAX) "[^\n]", name, path); if (!strcmp(name, "name=systemd")) return CString(path); } // Not found return CString(); With the STRINGIFY() macro being the usual preprocessor trick: #define STRINGIFY_EXPANDED(val) #val #define STRINGIFY(val) STRINGIFY_EXPANDED(val) > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:217 > + while (char* line = fgets(buffer, 128, memInfoFile)) { Idea: instead of going line-by-line and using evil strtok(), take advantage of… fscanf() again! memoryTotal = memoryFree = activeFile = inactiveFile = slabReclaimable = notSet; while (!feof(memInfoFile)) { char token[50 + 1]; size_t amount; if (fscanf(memInfoFile, "%50s%zukB", token, &amount) != 2) break; if (!strcmp(token, "MemTotal:")) memoryTotal = amount; else if (!strcmp(token, "MemFree:")) memoryFree = amount; else ... if (memoryTotal != notSet && memoryFree != notSet && activeFile != notSet && inactiveFile != notSet && slabReclaimable != notSet) break; } Note how this avoids needing to manually split the input in tokens ourselves: instead we let fscanf() do the hard work of parsing the input and doing numeric conversions. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:420 > + return value; This whole function can be implemented as: size_t CGroupMemoryController::getCgroupFileValue(FILE* file) { size_t value; return (file && fscanf(file, "%zu", &value) == 1) ? value : notSet; }

Attachments
patch (8.82 KB, patch) 2020-04-12 03:18 PDT, Pablo Saavedra	no flags	Details Formatted Diff Diff
patch (8.76 KB, patch) 2020-04-13 23:32 PDT, Pablo Saavedra	no flags	Details Formatted Diff Diff
patch (9.09 KB, patch) 2020-04-18 06:33 PDT, Pablo Saavedra	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Pablo Saavedra

Comment 1 2020-04-12 03:18:16 PDT

Created attachment 396217 [details] patch

Pablo Saavedra

Comment 2 2020-04-12 03:21:52 PDT

Comment on attachment 396217 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=396217&action=review > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:66 > static size_t lowWatermarkPages(FILE* zoneInfoFile) See notes in comment #1 from 210345 https://bugs.webkit.org/show_bug.cgi?id=210345#c1. Probably we want just to remove this code because memAvailable is already in /proc/meminfo since Linux 3.14 LTS and this is already EOL since 2016 so there is no reason to maintain this code.

Adrian Perez

Comment 3 2020-04-13 10:06:56 PDT

Comment on attachment 396217 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=396217&action=review I think this is going in the right direction (thanks!) but I have a few questions and comment, please check them below 😄️ >> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:66 >> static size_t lowWatermarkPages(FILE* zoneInfoFile) > > See notes in comment #1 from 210345 https://bugs.webkit.org/show_bug.cgi?id=210345#c1. Probably we want just to remove this code because memAvailable is already in /proc/meminfo since Linux 3.14 LTS and this is already EOL since 2016 so there is no reason to maintain this code. We would like to decide whether we want to keep supporting kernels older than 3.14 — but I think it's better to play safe and keep the code for now. If (or when) we decide to remove it, we can use a new bug for that. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:77 > + r = fscanf(zoneInfoFile, " Node %*u, zone %128[^\n]\n", buffer); Well, 128 bytes sounds a bit too much for the zone names, but at the same time it's not that much… so let's just leave it as is 😉️ Note that you should use “%127[^\n]” because fscanf() will read at most 127 characters and *then* add the '\0' terminator, which is character number 128. When using fscanf() I find it clearer to use something like: char buffer[128 + 1]; // +1 makes it clearer that we need space for \0 fscanf(f, "%128s", buffer); (Actually, you did just this below!) > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:80 > + r = fscanf(zoneInfoFile, "%s", buffer); This is unsafe, please indicate a field width corresponding to the size of the buffer as part of the format string: r = fscanf(zoneInfoFile, "%127s", buffer); ^^^ > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:81 > + if (r == 1 && inNormalZone && !strcmp(buffer, "low")) { How does this work? The “low” keyword is a few tokens *after* the zone name. Wouldn't you need a nested loop here reading tokens until “low” is seen? > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:168 > + // int scanResult = fscanf(cgroupControllerFile, "%*u:%40[^:]:%4096[^\n]", name, path); Please remove this commented statement.

Pablo Saavedra

Comment 4 2020-04-13 23:27:51 PDT

(In reply to Adrian Perez from comment #3) > Comment on attachment 396217 [details] > patch > > > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:81 > > + if (r == 1 && inNormalZone && !strcmp(buffer, "low")) { > > How does this work? The “low” keyword is a few tokens *after* the zone name. > Wouldn't you need a nested loop here reading tokens until “low” is seen? > The fscanf() reads the input stream file until the argument list is successfully filled. In the context of the `while (!feof(zoneInfoFile))` loop: - the first `fscanf(zoneInfoFile, " Node %*u, zone %128[^\n]\n", buffer);` iterates the `Node` sections. - Then, when we found a Normal node, we start to read each single `fscanf(zoneInfoFile, "%128s", buffer);` until find the `low` token. - Finally, we read the next token which is the actual `low` value. In this second scanning of the file I read the content one by one because the format of each row is not consistent (2, 3 or 6 values): Node 0, zone Normal pages free 27303 min 20500 low 24089 high 27678 spanned 3414016 present 3414016 managed 3337293 protection: (0, 0, 0, 0, 0)

Pablo Saavedra

Comment 5 2020-04-13 23:32:45 PDT

Created attachment 396384 [details] patch

Adrian Perez

Comment 6 2020-04-17 08:15:26 PDT

(In reply to Pablo Saavedra from comment #4) > (In reply to Adrian Perez from comment #3) > > Comment on attachment 396217 [details] > > patch > > > > > > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:81 > > > + if (r == 1 && inNormalZone && !strcmp(buffer, "low")) { > > > > How does this work? The “low” keyword is a few tokens *after* the zone name. > > Wouldn't you need a nested loop here reading tokens until “low” is seen? > > > > The fscanf() reads the input stream file until the argument list is > successfully filled. In the context of the `while (!feof(zoneInfoFile))` > loop: > > - the first `fscanf(zoneInfoFile, " Node %*u, zone %128[^\n]\n", buffer);` > iterates the `Node` sections. > - Then, when we found a Normal node, we start to read each single > `fscanf(zoneInfoFile, "%128s", buffer);` until find the `low` token. > - Finally, we read the next token which is the actual `low` value. > > In this second scanning of the file I read the content one by one because > the format of each row is not consistent (2, 3 or 6 values): > > Node 0, zone Normal > pages free 27303 > min 20500 > low 24089 > high 27678 > spanned 3414016 > present 3414016 > managed 3337293 > protection: (0, 0, 0, 0, 0) Thanks for the explanation, I get it now, but it's definitely not easy to understand taking a quick look. Knowing that the “low” keyword must be always contained inside a “Node” section, I was expecting a nested loop (the outer one only searching for “Node ... Normal” lines, the inner one picking the value of the “low“ keyword). I guess this is fine as-is but I would add a comment explaining how it works at the beginning of the loop.

Adrian Perez

Comment 7 2020-04-17 08:23:29 PDT

Comment on attachment 396384 [details] patch Other than the hardcoded buffer sizes, the parsing logic looks good, and it's nice that using fscanf() in all this code avoids the need for temporary memory allocations — which is important when there isn't much left :) View in context: https://bugs.webkit.org/attachment.cgi?id=396384&action=review > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:69 > + char buffer[128 + 1]; Please use a constant for the buffer size: #define ZONEINFO_TOKEN_BUFFER_SIZE 128 and then here: char buffer[ZONEINFO_TOKEN_BUFFER_SIZE + 1]; > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:77 > + r = fscanf(zoneInfoFile, " Node %*u, zone %128[^\n]\n", buffer); …then here you can use: int r = fscanf(zoneInfoFile, "Node %*u, zone %" STRINGIFY(ZONEINFO_TOKEN_BUFFER_SIZE) "[^\n]\n", buffer); instead of hardcoding the buffer sizeo in the input format string. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:80 > + r = fscanf(zoneInfoFile, "%128s", buffer); Here you could use fscanf(…, "%" STRINGIFY(ZONEINFO_TOKEN_BUFFER_SIZE) "s", buffer) instead of hardcoding the token buffer length. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:166 > + char name[40 + 1]; Same here, you could define CGROUP_NAME_BUFFER_SIZE. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:190 > + char token[50 + 1]; And here too: MEMINFO_TOKEN_BUFFER_SIZE.

Pablo Saavedra

Comment 8 2020-04-18 06:33:32 PDT

Created attachment 396845 [details] patch

EWS

Comment 9 2020-04-18 10:17:17 PDT

Committed r260318: <https://trac.webkit.org/changeset/260318> All reviewed patches have been landed. Closing bug and clearing flags on attachment 396845 [details].

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Linux

Product WebKit

Component WebKitGTK

Assignee

Pablo Saavedra

Reported

2020-04-10 10:29 PDT

Modified

2020-04-18 10:17 PDT History

CC List

6 users Show

URL

Keywords

Depends on

209942

Blocks

210345

Dependencies

tree graph