This is a really ugly question.
I have a C++ program which does the following in a loop:
- Waits for a JMS message
- Calculates some data
- Sends a JMS message in response
My program (let’s call it “Bob”) has a rather severe memory leak. The memory leak is located in a shared library that someone else wrote, which I must use, but the source code to which I do not have access.
This memory leak causes Bob to crash during the “calculates some data” phase of the loop. This is a problem, because another program is awaiting Bob’s response, and will be very upset if it does not receive one.
Due to various restrictions (yes, this is an X/Y problem, I told you it was ugly), I have determined that my only viable strategy is to modify Bob so that it does the following in its loop:
- Waits for a JMS message
- Calculates some data
- Sends a JMS message in response
- Checks to see whether it’s in danger of using “too much” memory
- If so, forks and execs another copy of itself, and gracefully exits
My question is as follows:
What is the best (reliable but not too inefficient) way to detect whether we’re using “too much” memory? My current thought is to compare getrlimit(RLIMIT_AS) rlim_cur to getrusage(RUSAGE_SELF) ru_maxrss; is that correct? If not, what’s a better way? Bob runs in a Linux VM on various host machines, all with different amounts of memory.
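For concreteness, here is a minimal sketch of the comparison described above. The function names and the idea of a "fraction" threshold are assumptions made for illustration, not part of Bob. Two caveats are worth keeping in mind when reading it: on Linux ru_maxrss is reported in kilobytes while rlim_cur is in bytes, and ru_maxrss is the peak resident set size whereas RLIMIT_AS caps virtual address space, so the two numbers do not measure quite the same thing.

```cpp
#include <sys/resource.h>

// Hypothetical helper: true when a peak RSS (in kilobytes, as Linux reports
// ru_maxrss) exceeds `fraction` of a limit given in bytes.
bool exceeds_fraction(long peak_rss_kb, double limit_bytes, double fraction) {
    const double peak_bytes = static_cast<double>(peak_rss_kb) * 1024.0;
    return peak_bytes > fraction * limit_bytes;
}

// Hypothetical check to call once per loop iteration: compares this
// process's peak RSS against the soft RLIMIT_AS limit.
bool near_memory_limit(double fraction) {
    rlimit lim{};
    rusage usage{};
    if (getrlimit(RLIMIT_AS, &lim) != 0 ||
        getrusage(RUSAGE_SELF, &usage) != 0) {
        return false;  // could not query; treat as "not near the limit"
    }
    if (lim.rlim_cur == RLIM_INFINITY) {
        return false;  // no address-space limit is set, nothing to compare to
    }
    return exceeds_fraction(usage.ru_maxrss,
                            static_cast<double>(lim.rlim_cur), fraction);
}
```

The loop would call something like near_memory_limit(0.8) after sending the response. Note that when no RLIMIT_AS limit is set, this sketch never reports danger, so a threshold would have to come from somewhere else.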
Answer
Assuming the memory leak occurs in the “Calculates some data” phase, I think it might make more sense to just refactor that portion into a separate program and fork out to execute that in its own process. That way you can at least isolate the offending code and make it easier to replace it in the future, rather than just masking the problem by having the program restart itself when it runs low on memory.
The “Calculates some data” part can either be a long-running process that waits for requests from the main program and restarts itself when necessary, or (even simpler) it could be a one-and-done program that just takes its data in *argv and sends its results to stdout. Then your main loop can just fork out and exec it every time through, and read the results when they come back. I would go with the simpler option if possible, but that will of course depend on what your needs are.
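As a rough sketch of the one-and-done variant, the main loop could use something like the helper below to fork, exec the worker, and collect whatever it writes to stdout. The name run_worker and the choice to treat any non-zero exit or crash as "no result" are assumptions for illustration; the real worker program and its argument format would be whatever you split out of Bob.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: runs args[0] with the given arguments in a child
// process and returns everything it wrote to stdout, or std::nullopt if the
// child could not be started, crashed, or exited with a non-zero status.
std::optional<std::string> run_worker(const std::vector<std::string>& args) {
    if (args.empty()) return std::nullopt;

    int fds[2];
    if (pipe(fds) != 0) return std::nullopt;

    // Build the argv array before forking so the child only needs to exec.
    std::vector<char*> argv;
    for (const auto& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return std::nullopt;
    }
    if (pid == 0) {
        // Child: route stdout into the pipe, then become the worker.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if exec failed
    }

    // Parent: read the worker's output until it closes its end of the pipe.
    close(fds[1]);
    std::string out;
    char buf[4096];
    for (;;) {
        const ssize_t n = read(fds[0], buf, sizeof buf);
        if (n > 0) {
            out.append(buf, static_cast<size_t>(n));
        } else if (n == 0 || errno != EINTR) {
            break;  // end of output, or a real read error
        }
    }
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return std::nullopt;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return std::nullopt;
    return out;
}
```

The main loop would then call, say, run_worker({"./calc_worker", input}) (where ./calc_worker is a made-up name for the split-out program) and send the returned string back over JMS. Because the leaky library lives only in the short-lived child, its memory is reclaimed every time the child exits, and a crash in the child shows up as an empty result rather than taking Bob down.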