Bug 210346

Summary: [GTK][WPE] Replace evil strtok() calls with fscanf() in MemoryPressureMonitor.cpp
Product: WebKit Reporter: Pablo Saavedra <psaavedra>
Component: WebKitGTKAssignee: Pablo Saavedra <psaavedra>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, aperez, bugs-noreply, cgarcia, clopez, darin
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Linux   
Bug Depends on: 209942    
Bug Blocks: 210345    
Attachments:
Description Flags
patch
none
patch
none
patch none

Pablo Saavedra
Reported 2020-04-10 10:29:50 PDT
Related to https://bugs.webkit.org/show_bug.cgi?id=209942 See details in https://bugs.webkit.org/show_bug.cgi?id=209942#c18 > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:173 > // 1:name=systemd:/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service These lines look quite suitable to be parsed with fscanf(): while (!feof(inputFile)) { char name[40 + 1]; char path[PATH_MAX + 1]; fscanf("%*u:%40[^:]:%" STRINGIFY(PATH_MAX) "[^\n]", name, path); if (!strcmp(name, "name=systemd")) return CString(path); } // Not found return CString(); With the STRINGIFY() macro being the usual preprocessor trick: #define STRINGIFY_EXPANDED(val) #val #define STRINGIFY(val) STRINGIFY_EXPANDED(val) > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:217 > + while (char* line = fgets(buffer, 128, memInfoFile)) { Idea: instead of going line-by-line and using evil strtok(), take advantage of… fscanf() again! memoryTotal = memoryFree = activeFile = inactiveFile = slabReclaimable = notSet; while (!feof(memInfoFile)) { char token[50 + 1]; size_t amount; if (fscanf(memInfoFile, "%50s%zukB", token, &amount) != 2) break; if (!strcmp(token, "MemTotal:")) memoryTotal = amount; else if (!strcmp(token, "MemFree:")) memoryFree = amount; else ... if (memoryTotal != notSet && memoryFree != notSet && activeFile != notSet && inactiveFile != notSet && slabReclaimable != notSet) break; } Note how this avoids needing to manually split the input in tokens ourselves: instead we let fscanf() do the hard work of parsing the input and doing numeric conversions. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:420 > + return value; This whole function can be implemented as: size_t CGroupMemoryController::getCgroupFileValue(FILE* file) { size_t value; return (file && fscanf(file, "%zu", &value) == 1) ? value : notSet; }
Attachments
patch (8.82 KB, patch)
2020-04-12 03:18 PDT, Pablo Saavedra
no flags
patch (8.76 KB, patch)
2020-04-13 23:32 PDT, Pablo Saavedra
no flags
patch (9.09 KB, patch)
2020-04-18 06:33 PDT, Pablo Saavedra
no flags
Pablo Saavedra
Comment 1 2020-04-12 03:18:16 PDT
Pablo Saavedra
Comment 2 2020-04-12 03:21:52 PDT
Comment on attachment 396217 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=396217&action=review > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:66 > static size_t lowWatermarkPages(FILE* zoneInfoFile) See notes in comment #1 from 210345 https://bugs.webkit.org/show_bug.cgi?id=210345#c1. Probably we want just to remove this code because memAvailable is already in /proc/meminfo since Linux 3.14 LTS and this is already EOL since 2016 so there is no reason to maintain this code.
Adrian Perez
Comment 3 2020-04-13 10:06:56 PDT
Comment on attachment 396217 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=396217&action=review I think this is going in the right direction (thanks!) but I have a few questions and comment, please check them below 😄️ >> Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:66 >> static size_t lowWatermarkPages(FILE* zoneInfoFile) > > See notes in comment #1 from 210345 https://bugs.webkit.org/show_bug.cgi?id=210345#c1. Probably we want just to remove this code because memAvailable is already in /proc/meminfo since Linux 3.14 LTS and this is already EOL since 2016 so there is no reason to maintain this code. We would like to decide whether we want to keep supporting kernels older than 3.14 — but I think it's better to play safe and keep the code for now. If (or when) we decide to remove it, we can use a new bug for that. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:77 > + r = fscanf(zoneInfoFile, " Node %*u, zone %128[^\n]\n", buffer); Well, 128 bytes sounds a bit too much for the zone names, but at the same time it's not that much… so let's just leave it as is 😉️ Note that you should use “%127[^\n]” because fscanf() will read at most 127 characters and *then* add the '\0' terminator, which is character number 128. When using fscanf() I find it clearer to use something like: char buffer[128 + 1]; // +1 makes it clearer that we need space for \0 fscanf(f, "%128s", buffer); (Actually, you did just this below!) > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:80 > + r = fscanf(zoneInfoFile, "%s", buffer); This is unsafe, please indicate a field width corresponding to the size of the buffer as part of the format string: r = fscanf(zoneInfoFile, "%127s", buffer); ^^^ > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:81 > + if (r == 1 && inNormalZone && !strcmp(buffer, "low")) { How does this work? The “low” keyword is a few tokens *after* the zone name. Wouldn't you need a nested loop here reading tokens until “low” is seen? > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:168 > + // int scanResult = fscanf(cgroupControllerFile, "%*u:%40[^:]:%4096[^\n]", name, path); Please remove this commented statement.
Pablo Saavedra
Comment 4 2020-04-13 23:27:51 PDT
(In reply to Adrian Perez from comment #3) > Comment on attachment 396217 [details] > patch > > > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:81 > > + if (r == 1 && inNormalZone && !strcmp(buffer, "low")) { > > How does this work? The “low” keyword is a few tokens *after* the zone name. > Wouldn't you need a nested loop here reading tokens until “low” is seen? > The fscanf() reads the input stream file until the argument list is successfully filled. In the context of the `while (!feof(zoneInfoFile))` loop: - the first `fscanf(zoneInfoFile, " Node %*u, zone %128[^\n]\n", buffer);` iterates the `Node` sections. - Then, when we found a Normal node, we start to read each single `fscanf(zoneInfoFile, "%128s", buffer);` until find the `low` token. - Finally, we read the next token which is the actual `low` value. In this second scanning of the file I read the content one by one because the format of each row is not consistent (2, 3 or 6 values): Node 0, zone Normal pages free 27303 min 20500 low 24089 high 27678 spanned 3414016 present 3414016 managed 3337293 protection: (0, 0, 0, 0, 0)
Pablo Saavedra
Comment 5 2020-04-13 23:32:45 PDT
Adrian Perez
Comment 6 2020-04-17 08:15:26 PDT
(In reply to Pablo Saavedra from comment #4) > (In reply to Adrian Perez from comment #3) > > Comment on attachment 396217 [details] > > patch > > > > > > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:81 > > > + if (r == 1 && inNormalZone && !strcmp(buffer, "low")) { > > > > How does this work? The “low” keyword is a few tokens *after* the zone name. > > Wouldn't you need a nested loop here reading tokens until “low” is seen? > > > > The fscanf() reads the input stream file until the argument list is > successfully filled. In the context of the `while (!feof(zoneInfoFile))` > loop: > > - the first `fscanf(zoneInfoFile, " Node %*u, zone %128[^\n]\n", buffer);` > iterates the `Node` sections. > - Then, when we found a Normal node, we start to read each single > `fscanf(zoneInfoFile, "%128s", buffer);` until find the `low` token. > - Finally, we read the next token which is the actual `low` value. > > In this second scanning of the file I read the content one by one because > the format of each row is not consistent (2, 3 or 6 values): > > Node 0, zone Normal > pages free 27303 > min 20500 > low 24089 > high 27678 > spanned 3414016 > present 3414016 > managed 3337293 > protection: (0, 0, 0, 0, 0) Thanks for the explanation, I get it now, but it's definitely not easy to understand taking a quick look. Knowing that the “low” keyword must be always contained inside a “Node” section, I was expecting a nested loop (the outer one only searching for “Node ... Normal” lines, the inner one picking the value of the “low“ keyword). I guess this is fine as-is but I would add a comment explaining how it works at the beginning of the loop.
Adrian Perez
Comment 7 2020-04-17 08:23:29 PDT
Comment on attachment 396384 [details] patch Other than the hardcoded buffer sizes, the parsing logic looks good, and it's nice that using fscanf() in all this code avoids the need for temporary memory allocations — which is important when there isn't much left :) View in context: https://bugs.webkit.org/attachment.cgi?id=396384&action=review > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:69 > + char buffer[128 + 1]; Please use a constant for the buffer size: #define ZONEINFO_TOKEN_BUFFER_SIZE 128 and then here: char buffer[ZONEINFO_TOKEN_BUFFER_SIZE + 1]; > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:77 > + r = fscanf(zoneInfoFile, " Node %*u, zone %128[^\n]\n", buffer); …then here you can use: int r = fscanf(zoneInfoFile, "Node %*u, zone %" STRINGIFY(ZONEINFO_TOKEN_BUFFER_SIZE) "[^\n]\n", buffer); instead of hardcoding the buffer sizeo in the input format string. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:80 > + r = fscanf(zoneInfoFile, "%128s", buffer); Here you could use fscanf(…, "%" STRINGIFY(ZONEINFO_TOKEN_BUFFER_SIZE) "s", buffer) instead of hardcoding the token buffer length. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:166 > + char name[40 + 1]; Same here, you could define CGROUP_NAME_BUFFER_SIZE. > Source/WebKit/UIProcess/linux/MemoryPressureMonitor.cpp:190 > + char token[50 + 1]; And here too: MEMINFO_TOKEN_BUFFER_SIZE.
Pablo Saavedra
Comment 8 2020-04-18 06:33:32 PDT
EWS
Comment 9 2020-04-18 10:17:17 PDT
Committed r260318: <https://trac.webkit.org/changeset/260318> All reviewed patches have been landed. Closing bug and clearing flags on attachment 396845 [details].
Note You need to log in before you can comment on or make changes to this bug.