When the Linux kernel runs on a NUMA machine, each NUMA node has partially separate memory management. The SysRq function echo 'm' > /proc/sysrq-trigger ("Will dump current memory info to your console.", implemented as sysrq_handle_showmem and show_mem) prints basic memory statistics for every NUMA node to the system console, dmesg, and the kernel log.
As I understand it, this output includes the memory used by the kernel's disk cache (page cache) for every NUMA node, probably from the active_file:%lu inactive_file:%lu code of show_free_areas. (Does this correspond to the cached line in the output of the free tool?)
I want to monitor disk cache usage across NUMA nodes over long periods with frequent updates, and I do not want to fill the entire console and dmesg with SysRq-m output. I plan to find out how multi-process or multi-threaded programs (not bound to a core or node with affinity) interact with page cache pages placed in another node's memory.
Is this information (page cache memory usage per NUMA node) available to programs without using SysRq, by reading and parsing some special files in /proc or in /sys? Or is it necessary to write a new kernel module for this?
The free tool uses /proc/meminfo to print cache ("Memory used by the page cache and slabs") for the entire system, not for every NUMA node. I was unable to find per-NUMA memory stats in the proc(5) man page: http://man7.org/linux/man-pages/man5/proc.5.html
There is numastat (https://www.kernel.org/doc/Documentation/numastat.txt), but it has no page cache memory statistics; as I understand it, it reports only cross-NUMA page allocation counts, which can be useless when processes often move between NUMA nodes.
Answer
There are /sys/devices/system/node/nodeX/meminfo files for every node with basic memory info, for example /sys/devices/system/node/node0/meminfo for NUMA node 0, /sys/devices/system/node/node1/meminfo for node 1, etc.
They should be similar in format to the system-wide /proc/meminfo file, which is what the free utility actually uses; its man page has a basic description of the meminfo fields: http://man7.org/linux/man-pages/man1/free.1.html
free displays the total amount of free and used physical and swap memory in the system, as well as the buffers and caches used by the kernel. The information is gathered by parsing /proc/meminfo. The displayed columns are:
    total       Total installed memory (MemTotal and SwapTotal in /proc/meminfo)
    used        Used memory (calculated as total - free - buffers - cache)
    free        Unused memory (MemFree and SwapFree in /proc/meminfo)
    shared      Memory used (mostly) by tmpfs (Shmem in /proc/meminfo)
    buffers     Memory used by kernel buffers (Buffers in /proc/meminfo)
    cache       Memory used by the page cache and slabs (Cached and SReclaimable in /proc/meminfo)
    buff/cache  Sum of buffers and cache
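The column arithmetic quoted above can be written out as a short Python sketch. It follows only the quoted man page text; newer versions of free may compute "used" differently, and the helper names parse_meminfo and free_columns are made up for this illustration.

```python
import os

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        tokens = rest.split()
        if sep and tokens:
            values[name.strip()] = int(tokens[0])  # value is in kB for most fields
    return values

def free_columns(m):
    """Compute the memory columns as described in the quoted free(1) text."""
    cache = m["Cached"] + m["SReclaimable"]
    buffers = m["Buffers"]
    return {
        "total": m["MemTotal"],
        "free": m["MemFree"],
        "buffers": buffers,
        "cache": cache,
        "buff/cache": buffers + cache,
        "used": m["MemTotal"] - m["MemFree"] - buffers - cache,
    }

if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(free_columns(parse_meminfo(f.read())))
```

This only covers the system-wide numbers; the per-node files need slightly different parsing because each line carries a "Node N" prefix.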
The per-node meminfo file is mentioned in https://www.kernel.org/doc/Documentation/ABI/stable/sysfs-devices-node:
What:        /sys/devices/system/node/nodeX/meminfo
Date:        October 2002
Contact:     Linux Memory Management list <linux-mm@kvack.org>
Description: Provides information about the node's distribution and memory utilization. Similar to /proc/meminfo, see Documentation/filesystems/proc.txt
The full description of the meminfo fields is in https://www.kernel.org/doc/Documentation/filesystems/proc.txt
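You (I) need the "Cached" line from each NUMA node's meminfo to get information about page cache distribution between NUMA nodes (in the per-node files this counter may appear under the name "FilePages" instead, so check the actual output on your kernel):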
Buffers: Relatively temporary storage for raw disk blocks shouldn't get tremendously large (20MB or so)
Cached: in-memory cache for files read from the disk (the pagecache). Doesn't include SwapCached
SReclaimable: Part of Slab, that might be reclaimed, such as caches
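As a minimal sketch of reading that value from a program, the Python below parses the per-node files. It assumes each line has the form "Node 0 Name: value kB", and it accepts either "FilePages" or "Cached" as the page cache counter, since the name may differ from /proc/meminfo; the function names are illustrative only.

```python
import glob
import re

# Assumed per-node line shape: "Node 0 FilePages:        123456 kB"
# Some lines (e.g. HugePages_Total) carry no "kB" suffix.
_LINE = re.compile(r"^Node\s+(\d+)\s+([^:]+):\s+(\d+)(?:\s+kB)?\s*$")

def parse_node_meminfo(text):
    """Return {field name: integer value} for one node's meminfo text."""
    fields = {}
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m:
            fields[m.group(2)] = int(m.group(3))
    return fields

def page_cache_kb(fields):
    """Return the page cache size in kB, or None if no known field is present."""
    for name in ("FilePages", "Cached"):
        if name in fields:
            return fields[name]
    return None

def read_all_nodes(pattern="/sys/devices/system/node/node[0-9]*/meminfo"):
    """Map 'nodeN' to its parsed fields; empty on machines without these files."""
    result = {}
    for path in sorted(glob.glob(pattern)):  # lexicographic order, so node10 precedes node2
        with open(path) as f:
            result[path.split("/")[-2]] = parse_node_meminfo(f.read())
    return result

if __name__ == "__main__":
    for node, fields in read_all_nodes().items():
        print(node, page_cache_kb(fields), "kB")
```

No kernel module or SysRq output is involved; the script only reads sysfs text files, so it can run unprivileged.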
Some parts of used memory can be dirty:
Dirty: Memory which is waiting to get written back to the disk
Writeback: Memory which is actively being written back to the disk
It also shows how much memory is used for userspace tasks as anonymous:
AnonPages: Non-file backed pages mapped into userspace page tables
AnonHugePages: Non-file backed huge pages mapped into userspace page tables
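For the long-running monitoring described in the question, a polling loop over the per-node files is one possible approach. The sketch below writes one CSV row per node per sample. It assumes the per-node line shape "Node N Name: value kB" and that the page cache counter is named FilePages there; adjust FIELDS if your kernel shows different names. The interval and iteration count are arbitrary defaults.

```python
import glob
import sys
import time

# Counters to record; names are an assumption about the per-node file contents.
FIELDS = ("FilePages", "Dirty", "Writeback", "AnonPages")

def extract(text, wanted=FIELDS):
    """Pick the wanted counters (kB) out of one node's meminfo text."""
    out = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected tokens: ["Node", "<id>", "<Name>:", "<value>", "kB"]
        if len(parts) >= 4 and parts[0] == "Node":
            name = parts[2].rstrip(":")
            if name in wanted:
                out[name] = int(parts[3])
    return out

def sample(pattern="/sys/devices/system/node/node[0-9]*/meminfo"):
    """Read every node once and return a list of (node name, counters)."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            rows.append((path.split("/")[-2], extract(f.read())))
    return rows

def monitor(interval_s=1.0, iterations=5, out=sys.stdout):
    """Write CSV rows: timestamp, node, then one column per entry in FIELDS."""
    out.write("time,node," + ",".join(FIELDS) + "\n")
    for _ in range(iterations):
        now = time.time()
        for node, counters in sample():
            values = [str(counters.get(name, "")) for name in FIELDS]
            out.write(f"{now:.3f},{node}," + ",".join(values) + "\n")
        time.sleep(interval_s)

if __name__ == "__main__":
    monitor()
```

Redirecting the output to a file keeps the console and dmesg clean, which was the concern with repeated SysRq-m dumps.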