I have never used Valgrind, but I think this tool can help me with my question. I would be grateful for any help.
In my R code, I use the MixedModels Julia package.
I integrate Julia in R using the JuliaCall package.
I work with very large datasets (~1 GB, ~4x10^6 observations), and at the modeling step (mixed models) a lot of RAM is allocated (~130 GB). Most of it is not returned to the system after the calculations finish.
I would like to analyze the code and see the whole stack of R and Julia functions.
It is very important for me to understand which functions are called during the mixed models calculation with Julia (especially low-level functions, most likely written in C/C++), and how much memory each of these functions uses.
It is also important to understand what exactly the memory is spent on, and what happens in RAM while the functions from the MixedModels package are running.
Perhaps understanding this will help me improve the performance of the code and reduce the memory allocation.
Maybe some other tool (rather than Valgrind) would be more useful for my task. I would be very grateful for relevant recommendations!
Answer
As an example of valgrind --tool=massif, here is a case from Git 2.38 (Q3 2022). It is not related to R or Julia; it only illustrates what the tool shows.
See commit 51d1b69 (26 Jul 2022) by Jeff King (peff).
See commit 068fa54, commit 90b2bb7, commit 5766524 (19 Jul 2022) by Derrick Stolee (derrickstolee).
(Merged by Junio C Hamano -- gitster -- in commit acbec18, 03 Aug 2022)
The codepath to write the multi-pack index has been taught to release a large chunk of memory that holds an array of objects in the packs, as soon as it is done with the array, to reduce memory consumption.
midx: reduce memory pressure while writing bitmaps
Signed-off-by: Derrick Stolee
We noticed that some 'git multi-pack-index write --bitmap' processes were running with very high memory.
It turns out that a lot of this memory is required to store a list of every object in the written multi-pack-index, with a second copy that has additional information used for the bitmap writing logic.
Using 'valgrind --tool=massif' before this change, the following chart shows how memory load increased and was maintained throughout the process:
[massif chart: memory load peaks at 4.102 GB]
It turns out that the 'struct write_midx_context' data is persisting through the life of the process, including the 'entries' array.
This array is used last inside find_commits_for_midx_bitmap() within write_midx_bitmap().
If we free (and nullify) the array at that point, we can free a decent chunk of memory before the bitmap logic adds more to the memory footprint.
Here is the massif memory load chart after this change:
[massif chart: memory load peaks at 3.111 GB]
The previous change introduced a refactoring of write_midx_bitmap() to make it more clear how much of the 'struct write_midx_context' instance is needed at different parts of the process.
In addition, the following defensive programming measures were put in place:
- Using FREE_AND_NULL() we will at least get a segfault from reading a NULL pointer instead of a use-after-free.
- 'entries_nr' is also set to zero to make any loop that would iterate over the entries be trivial.
- Add significant comments in write_midx_internal() to add warnings for future authors who might accidentally add references to this cleared memory.
Note that valgrind --tool=massif, as the documentation mentions, measures only heap memory, i.e. memory allocated with malloc, calloc, realloc, memalign, new, new[], and a few other, similar functions.
This means it does not directly measure memory allocated with lower-level system calls such as mmap, mremap, and brk.
See more with "What is the difference between 'time -f "%M"' and 'valgrind --tool=massif'?".