
How does the memory behind statically allocated huge pages get distributed across NUMA nodes?

In my /etc/default/grub file I have explicitly set aside N huge pages with “hugepages=N”. If I’m running on a box with 2 NUMA nodes, do N/2 huge pages get set aside for each node, do they all go to node 0, or something else? Also, is there a way on the command line to query how they’re split across nodes?


Answer

From the kernel.org documentation on hugepages:

On a NUMA platform, the kernel will attempt to distribute the huge page pool across the set of allowed nodes specified by the NUMA memory policy of the task that modifies nr_hugepages. The default for the allowed nodes (when the task has the default memory policy) is all on-line nodes with memory. Allowed nodes with insufficient available, contiguous memory for a huge page will be silently skipped when allocating persistent huge pages. See the kernel documentation's discussion of the interaction of task memory policy, cpusets, and per-node attributes with the allocation and freeing of persistent huge pages.
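
For example, the allocation can be steered with the writing task's memory policy. A minimal sketch, assuming numactl is installed and the kernel exposes /proc/sys/vm/nr_hugepages_mempolicy (the policy-aware counterpart of nr_hugepages); the node number and page counts are purely illustrative:

# Constrain growth of the persistent pool to node 0 via a bind policy;
# the write runs inside sh, so it inherits numactl's memory policy:
numactl --membind=0 sh -c 'echo 128 > /proc/sys/vm/nr_hugepages_mempolicy'

# With the default policy, the same request is spread over all
# on-line nodes that have memory:
echo 128 > /proc/sys/vm/nr_hugepages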

The success or failure of huge page allocation depends on the amount of physically contiguous memory that is present in the system at the time of the allocation attempt. If the kernel is unable to allocate huge pages from some nodes in a NUMA system, it will attempt to make up the difference by allocating extra pages on other nodes with sufficient available contiguous memory, if any.
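
If you want to pin the count on a specific node rather than rely on that spread-and-fallback behaviour, each node also exports its own attributes under sysfs. A sketch, assuming the default 2 MiB huge page size (the hugepages-2048kB directory) and a node numbered 0:

# Request 64 persistent huge pages on node 0 only (run as root):
echo 64 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

# Read back how many the kernel could actually allocate on that node:
cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages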

System administrators may want to put the command that sets nr_hugepages (for example, a write to /proc/sys/vm/nr_hugepages) in one of the local rc init files. This will enable the kernel to allocate huge pages early in the boot process, when the possibility of getting physically contiguous pages is still very high. Administrators can verify the number of huge pages actually allocated by checking the sysctl or /proc/meminfo. To check the per-node distribution of huge pages in a NUMA system, use:

cat /sys/devices/system/node/node*/meminfo | fgrep Huge
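
That answers the command-line part of the question. The same per-node counters are also broken out under the sysfs hugepages directories; a sketch, assuming 2 MiB huge pages (adjust the hugepages-<size>kB component for other page sizes):

# Total and free huge pages on every node:
for d in /sys/devices/system/node/node*/hugepages/hugepages-2048kB; do
    echo "$d: total=$(cat "$d/nr_hugepages") free=$(cat "$d/free_hugepages")"
done

# System-wide totals for comparison:
grep -i huge /proc/meminfo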