I write three different programs to copy data from one 4GB buffer to another 4GB buffer, and I measure their bandwidth and cache misses with perf stat. The code is shown below; I compile it with gcc memcpy-test.c -o memcpy-test. The first one uses memcpy to copy memcpy_sz bytes at a time. I test this with 8B, 64B, 4KB, 512KB, …
Tag: memory
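For reference, a rough sketch of the kind of chunked-copy test described above. The 4 GiB buffers and the memcpy_sz name follow the question; the timing loop, the command-line argument, and the perf invocation are my own assumptions, not the questioner's actual code.

    /*
     * Build (as in the question):  gcc memcpy-test.c -o memcpy-test
     * Measure cache misses:        perf stat -e cache-misses ./memcpy-test 64
     * Needs more than 8 GiB of RAM for the two buffers.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    #define BUF_SIZE (4UL << 30)    /* 4 GiB source and destination buffers */

    int main(int argc, char **argv)
    {
        size_t memcpy_sz = argc > 1 ? strtoul(argv[1], NULL, 0) : 64;
        char *src = malloc(BUF_SIZE);
        char *dst = malloc(BUF_SIZE);
        if (!src || !dst) { perror("malloc"); return 1; }

        memset(src, 1, BUF_SIZE);    /* touch every page up front */
        memset(dst, 0, BUF_SIZE);

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t off = 0; off + memcpy_sz <= BUF_SIZE; off += memcpy_sz)
            memcpy(dst + off, src + off, memcpy_sz);   /* chunked copy */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("chunk %zu B: %.2f GB/s\n", memcpy_sz, BUF_SIZE / secs / 1e9);

        free(src);
        free(dst);
        return 0;
    }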
Python ProcessPoolExecutor memory problems
This is in Linux, Python 3.8. I use ProcessPoolExecutor to speed up the processing of a list of large dataframes, but because they all get copied in each process, I run out of memory. How do I solve this problem? My code looks like this: I want to minimize the unnecessary copying of data, i.e. minimize my memory footprint. What’s …
understand sysstat sar memory output
I’m preparing for more traffic in the days to come, and I want to be sure the server can handle it. Running sar -q, the load of “3.5” doesn’t seem like much on a 32-CPU machine: However, I’m not sure about the memory. Running sar -r shows 98.5% for %memused and only 13.60% for %commit: Running htop seems OK too: 14.9G/126G.
Is there a memory allocation path other than the buddy allocator in Linux?
I’m studying memory allocation in Linux and making some changes to the buddy allocator (__alloc_pages_nodemask) for my experiments. I create a new flag in struct page->flags (by adding a new entry to enum pageflags in page-flags.h). I set this bit permanently in __alloc_pages_nodemask (so that it is not cleared once set and survives all further allocations and frees). I modify PAGE_FLAGS_CHECK_AT_PREP to …
Is it possible to add a customized name for the (non file-backed) mmap region?
Just curious: is it possible to specify a name for a non file-backed mmap region? Something like the [New VMA area] in the following example: Answer: The content of /proc/<pid>/maps comes from the show_map_vma function in fs/proc/task_mmu.c. Looking at it, if you want a custom name for a non-file-backed mapping, it’d need to come from either vma->vm_ops->name or …
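As an aside on the question above: on kernels built with CONFIG_ANON_VMA_NAME (Linux 5.17 and later), user space can label an anonymous mapping itself via prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...), and the label should show up in /proc/<pid>/maps as [anon:label]. A minimal sketch assuming such a kernel; the fallback #defines cover older distro headers, and the region name "my-region" is just an example.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/prctl.h>
    #include <unistd.h>

    #ifndef PR_SET_VMA                  /* fallbacks for older uapi headers */
    #define PR_SET_VMA            0x53564d41
    #endif
    #ifndef PR_SET_VMA_ANON_NAME
    #define PR_SET_VMA_ANON_NAME  0
    #endif

    int main(void)
    {
        size_t len = 16 * 4096;
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* Ask the kernel to tag this VMA with a custom name; on kernels
           without CONFIG_ANON_VMA_NAME this fails with EINVAL. */
        if (prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
                  (unsigned long)p, len, (unsigned long)"my-region") != 0)
            perror("prctl(PR_SET_VMA_ANON_NAME)");

        /* Show the resulting "[anon:my-region]" maps line, if supported. */
        char cmd[64];
        snprintf(cmd, sizeof(cmd), "grep my-region /proc/%d/maps", (int)getpid());
        system(cmd);

        munmap(p, len);
        return 0;
    }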
How does the OS kernel get notified when memory is accessed?
As far as I know, the OS kernel maintains the translation from virtual addresses to physical addresses: the userspace program uses virtual addresses, while the CPU uses physical addresses. Since all machine code is executed by the CPU, how does the OS kernel know that a memory-access instruction has been executed, and how does it translate the virtual address to a physical address? The CPU can execute a syscall …
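To make the fault-driven model behind this question concrete: ordinary loads and stores are translated by the MMU in hardware, using page tables the kernel set up earlier, so the kernel is not involved at all; it only runs when the translation or a permission check fails and the CPU raises a page fault. A small sketch of my own (not from the question) that makes one such fault visible by revoking a page's permissions and catching the resulting SIGSEGV:

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void *page;
    static size_t page_len;

    /* Runs after the CPU faulted and the kernel's page-fault handler
       decided to deliver SIGSEGV to this process. */
    static void on_fault(int sig, siginfo_t *si, void *ctx)
    {
        (void)sig; (void)si; (void)ctx;
        static const char msg[] = "SIGSEGV: the kernel saw this access\n";
        write(STDERR_FILENO, msg, sizeof(msg) - 1);
        /* Restore permissions so the faulting store can be restarted.
           (mprotect is not formally async-signal-safe; fine for a demo.) */
        mprotect(page, page_len, PROT_READ | PROT_WRITE);
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_sigaction = on_fault;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        page_len = (size_t)sysconf(_SC_PAGESIZE);
        page = mmap(NULL, page_len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED) { perror("mmap"); return 1; }

        /* First touch: a demand-paging fault the kernel services silently. */
        ((volatile char *)page)[0] = 1;

        /* Revoke access, then touch again: now the fault is a protection
           violation, the kernel is entered, and it notifies us via SIGSEGV. */
        mprotect(page, page_len, PROT_NONE);
        ((volatile char *)page)[0] = 2;

        puts("both stores completed");
        return 0;
    }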
What would be a good algorithm for patching binary files on Linux?
I am trying to reduce data transfer to my embedded Linux device by creating patches for nearly identical binaries. I have memory constraints on my device, so heavy algorithms like bsdiff and bspatch are unaffordable on my target for binaries of around 36-60 MB. I would like to know the commands that have the best algorithms for diffing and …
What is using so much memory on an idle Linux server? Comparing output of “htop” and “ps aux”
I am trying to understand and compare the output I see from htop (sorted by mem%) and “ps aux --sort=-%mem | grep query.jar”, and determine why 24.2G out of 32.3G is in use on an idle server. The ps command shows a single parent (not a child process, I assume): Whereas htop shows PID 6790 as well as many other PIDs …
Running address of an application, followed by heap and stack expansions
I have an m.c: and an a.c: I compile and build these as: Then I examine the executable, linux, thus: objdump -drwxCS -Mintel linux. The output of this on my Ubuntu 16.04.6 starts off with: start address 0x0000000000400540. Then, later, comes the .init section: Finally, the .fini section: The program references the string Hello , world!\n, which is in …
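The question's m.c and a.c are omitted from this excerpt, so here is a stand-alone sketch of my own that shows, at run time, the regions the objdump output and the title refer to: an address in .text, one in .data, a heap allocation, and a stack variable.

    #include <stdio.h>
    #include <stdlib.h>

    static int in_data = 42;                /* ends up in .data */

    int main(void)
    {
        int on_stack = 0;                   /* lives on the stack */
        void *on_heap = malloc(4096);       /* forces a heap allocation */

        /* Casting a function pointer to void * is a common POSIX-ism,
           good enough for printing an address here. */
        printf(".text  (main)     : %p\n", (void *)&main);
        printf(".data  (in_data)  : %p\n", (void *)&in_data);
        printf("heap   (malloc)   : %p\n", on_heap);
        printf("stack  (on_stack) : %p\n", (void *)&on_stack);

        free(on_heap);
        return 0;
    }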
Where is the memusage command in Ubuntu? [closed]