I want to write a function computeWriteSet
that takes an arbitrary function f
as an argument and (1) executes the function f
and (2) returns the set of places modified or written to (addresses/pages/objects) during f
‘s execution.
writeset computeWriteSet(function f) { writeset ws = createEmptyWriteset(); // prepare to execute f startRecordingWrites(&ws); f(); stopRecordingWrites(&ws); // post-process the write-set return ws; }
- What options exist for implementing it?
- What are their tradeoffs (in which case which implementation is more efficient and what are the limitations?)
Notes
The function is specified at runtime and can do anything (i.e. can contain any set of instructions, including loops, branching and function/system calls.
All writes from the time f
is called until it returns should be recorded (this includes functions called from within f
itself). For simplicity, let’s assume computeWriteSet
is not called from within.
OS-specific tricks are allowed (and probably required). I’m particularly interested in Linux, ideally within userspace.
Example
static int x = 0; static int y = 0; static int z = 0; void a() { if (y) z++; if (x) y++; x = (x + 1) % 2; } int main() { computeWriteSet(a); // returns { &x } => {x,y,z} = {1, 0, 0} computeWriteSet(a); // returns { &x, &y } => {x,y,z} = {0, 1, 0} computeWriteSet(a); // returns { &x, &z } => {x,y,z} = {1, 1, 1} return 0; }
Expected Output
The output should be the set of changes. This can be either the set of pages:
{ <address of x>, <address of y>, …}
Or the set of memory addresses:
{<page of x and y>, <page of z>, …}
Or the set of objects ( (based on interposition of allocation functions)
x = malloc(100) // returns address 0xAAA y = malloc(200) // returns address 0xBBB … { {address, size}, {0xAAA, 100}, {0xBBB, 200}, … }
The return value is loosely specified on purpose — different techniques will have different spatial resolution and different overheads.
Please note:
This is a highly uncommon programming question, hence if you think it should be closed let me know why and, ideally, how to phrase/place it so that it follows the guidelines. 🙂
Advertisement
Answer
As suggested by @Barmar, one way of accomplishing this is via mprotect
.
This will generate one exception for each of the memory pages, which may add considerable overhead depending on the function. This exception is handled by us and we then insert the corresponding address into a set.
A small 100-line C++
/C
fully-contained messy program showcasing this is included below.
#include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <signal.h> #include <ucontext.h> #include <fcntl.h> #include <execinfo.h> #include <sys/mman.h> #include <set> #include <functional> #include <cassert> extern "C" { extern int __data_start; extern int _end; } #define PAGE_SIZE sysconf(_SC_PAGESIZE) #define PAGE_MASK (PAGE_SIZE - 1) #define PAGE_ALIGN_DOWN(x) (((intptr_t) (x)) & ~PAGE_MASK) #define PAGE_ALIGN_UP(x) ((((intptr_t) (x)) + PAGE_MASK) & ~PAGE_MASK) #define GLOBALS_START PAGE_ALIGN_DOWN((intptr_t) &__data_start) #define GLOBALS_END PAGE_ALIGN_UP((intptr_t) &_end - 1) #define GLOBALS_SIZE (GLOBALS_END - GLOBALS_START) std::set<void*> *addresses = new std::set<void*>(); void sighandler(int signum, siginfo_t *siginfo, void *ctx) { void *addr = siginfo->si_addr; void *aligned_addr = reinterpret_cast<void*>(PAGE_ALIGN_DOWN(addr)); switch(siginfo->si_code) { case SEGV_ACCERR: mprotect(aligned_addr, PAGE_SIZE, PROT_READ | PROT_WRITE); addresses->insert(aligned_addr); break; default: exit(-1); } } void computeWriteSet(std::function<void()> f) { static bool initialized = false; if (!initialized) { // install signal handler stack_t sigstk; sigstk.ss_sp = malloc(SIGSTKSZ); sigstk.ss_size = SIGSTKSZ; sigstk.ss_flags = 0; sigaltstack(&sigstk, NULL); struct sigaction siga; sigemptyset(&siga.sa_mask); sigaddset(&siga.sa_mask, SIGSEGV); sigprocmask(SIG_BLOCK, &siga.sa_mask, NULL); siga.sa_flags = SA_SIGINFO | SA_ONSTACK | SA_RESTART | SA_NODEFER; siga.sa_sigaction = sighandler; sigaction(SIGSEGV, &siga, NULL); sigprocmask(SIG_UNBLOCK, &siga.sa_mask, NULL); initialized = true; } addresses->clear(); printf("nexecuting functionn"); printf("--------------n"); mprotect(reinterpret_cast<void*>(GLOBALS_START), GLOBALS_SIZE, PROT_READ); f(); mprotect(reinterpret_cast<void*>(GLOBALS_START), GLOBALS_SIZE, PROT_READ | PROT_WRITE); printf("--------------n"); printf("pages written:n"); for (auto addr : *addresses) { printf("%pn", addr); } } void f() { static int x[1024] = {0}; static int y[1024] = {0}; static int z[1024] = {0}; static bool firsttime = true; if (firsttime) { printf("&x[0] = %pn&y[0] = %pn&z[0] = %pn", x, y, z); firsttime = false; } if (y[0]) z[0]++; if (x[0]) y[0]++; x[0] = (x[0] + 1) % 2; printf("{x, y, z} = {%d, %d, %d}n", x[0], y[0], z[0]); } int main() { computeWriteSet(f); computeWriteSet(f); computeWriteSet(f); return 0; }
Compiled using g++ --std=c++11 example.cpp
.
Execution prints something like:
executing function -------------- &x[0] = 0x6041c0 &y[0] = 0x6051c0 &z[0] = 0x6061c0 {x, y, z} = {1, 0, 0} -------------- pages written: 0x604000 executing function -------------- {x, y, z} = {0, 1, 0} -------------- pages written: 0x604000 0x605000 executing function -------------- {x, y, z} = {1, 1, 1} -------------- pages written: 0x604000 0x606000
Some notes:
We make
x
,y
andz
large-enough arrays (of sizePAGE_SIZE/sizeof(int)
, which is 1024 on my machine) so that they fall in different memory pages and therefore can be differentiated.This program will only work for global/static variables so that it can be short-ish. To extend it to work with heaps and other memory mappings can be done via interpositions, as suggested by @AShelly.
On-topic follow-up: Is there any way of avoiding the O(N)
signals, where N
is the number of pages written to?