If you’re here:
https://github.com/torvalds/linux/blob/master/fs/ext4/file.c#L360
You have access to these two structs inside the ext4_file_mmap function:
struct file *file, struct vm_area_struct *vma
I am changing the implementation of this function for dax mode so that the page tables get entirely filled out for the file the moment you call mmap (to see how much performance improves when no page faults are taken).
I have managed to get the following done so far (assuming I have access to the two structs that ext4_file_mmap has access to):
    // vm_area_struct defined in /include/linux/mm_types.h : 284
    // file defined in /include/linux/fs.h : 848
    loff_t file_size = file_inode(file)->i_size;
    unsigned long start_va = vma->vm_start;
Now, the difficulty lies here. How do I get the physical addresses (blocks? Not sure if dax uses blocks) associated with this file?
I have spent the last couple of days staring at the Linux source code, trying to make sense of stuff, and boy have I been successful.
Any help, hint, or suggestion is greatly appreciated! Thanks!
Some updates: When you mmap a file in dax mode, you don’t fetch anything into memory. The device, in this case PMEM, is byte-addressable and gives DDR-like latencies, so it’s accessed directly (no page cache copy in between). Certain ptes map straight to this PMEM device instead of to regular memory.
Answer
First of all, mmap supports the MAP_POPULATE flag, which exists specifically to avoid page faults. In principle it may not work with dax, but that’s unlikely.
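As a rough userspace illustration of the MAP_POPULATE suggestion, here is a minimal sketch. The helper name map_file_populated is hypothetical, and mapping the whole file read-only with MAP_SHARED is an assumption made for the example. Whether pre-faulting behaves the same on a dax mount is not verified here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only and ask the kernel to
 * pre-fault its pages at mmap() time via MAP_POPULATE.
 * Returns the mapping and stores its length in *len_out, or NULL on error. */
static void *map_file_populated(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* MAP_POPULATE is only a request: mmap() does not report an error
     * if the kernel could not populate every page. */
    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ,
                      MAP_SHARED | MAP_POPULATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return addr;
}
```

A caller would pass the path of a file on the dax-mounted filesystem, read through the returned pointer, and release it with munmap(addr, len). This needs no kernel change, so it is a cheap baseline to try before patching ext4_file_mmap.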
Second, it seems you don’t have any measurements of the current state of affairs. Just “changing something and checking the difference” is a fundamentally wrong approach: the change may remove the actual bottleneck as an unintended side effect, and the win will then be misattributed. You can start by using ‘perf’ to get basic numbers and generate flame graphs ( http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html ). If you do a lot of I/O over a small range, page faults should have a negligible effect.
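To get a first number for how many page faults the access pattern actually takes, something like the following sketch could complement perf. It is an assumption-laden example, not the answer's method: the helper names total_faults and faults_reading_file are hypothetical, and it counts faults with getrusage while reading one byte per page of a file mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor plus major page faults taken by this process so far. */
static long total_faults(void)
{
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;
    return ru.ru_minflt + ru.ru_majflt;
}

/* Hypothetical helper: map the first len bytes of path (len must not
 * exceed the file size), read one byte per page, and return how many
 * page faults were taken during the reads. Pass 0 or MAP_POPULATE as
 * extra_flags. Returns -1 on error. */
static long faults_reading_file(const char *path, size_t len, int extra_flags)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char *p = mmap(NULL, len, PROT_READ, MAP_SHARED | extra_flags, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    long before = total_faults(); /* measured after mmap(): populate cost is excluded */
    for (size_t off = 0; off < len; off += page)
        (void)*(volatile char *)(p + off); /* force a real read of each page */
    long after = total_faults();

    munmap(p, len);
    return after - before;
}
```

Calling it once with extra_flags set to 0 and once with MAP_POPULATE on the same file gives a fault count for each case. That shows whether faults are even numerous enough to matter before any time is spent on the kernel side.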