This is on Linux with Python 3.8. I use ProcessPoolExecutor to speed up the processing of a list of large dataframes, but because they all get copied into each process, I run out of memory. How do I solve this problem? My code looks like this: I want to minimize unnecessary copying of data, i.e. minimize my memory footprint. What’s
Tag: multiprocessing
libuv: difference between fork and uv_spawn?
Recently I have been playing around with libuv. I don’t get the programming model as far as child processes are concerned. For example, look at the following code: Here the output printed on the console is: In my understanding, uv_spawn acts like fork(). In the child process the value of r is 0, and in the parent process it is non-zero. So from
Is there a way to assign different jobs (processes) to specific core in linux using python?
I have Python functions that should run in parallel in a Linux environment using multiple cores. Is there a way to explicitly specify which core should be used for each process? Currently, I am using the Python multiprocessing module to run these functions as parallel processes on 4 cores. Answer Possibly with the combination of os.sched_setaffinity and os.sched_getaffinity. The docstring says:
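The following is not the docstring referred to above, only a minimal sketch of how the two affinity calls named in that answer might be combined with multiprocessing on Linux. The helper names pinned_worker and run_pinned are hypothetical, and the explicit "fork" start method is an assumption made to keep the example self-contained.

```python
import multiprocessing as mp
import os


def pinned_worker(core, results):
    """Pin the current process to one core, then report its affinity set."""
    # pid 0 means "the calling process"; the second argument is a set of core ids.
    os.sched_setaffinity(0, {core})
    results.put((core, sorted(os.sched_getaffinity(0))))


def run_pinned(cores):
    """Start one process per requested core (hypothetical helper)."""
    ctx = mp.get_context("fork")  # assumption: Linux, fork start method
    results = ctx.Queue()
    procs = [ctx.Process(target=pinned_worker, args=(core, results))
             for core in cores]
    for proc in procs:
        proc.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    report = dict(results.get() for _ in procs)
    for proc in procs:
        proc.join()
    return report


if __name__ == "__main__":
    # Only pick cores this process is actually allowed to run on.
    allowed = sorted(os.sched_getaffinity(0))
    print(run_pinned(allowed[:4]))
```

Each child restricts itself to a single core and reports the affinity set the kernel then returns for it. Both functions are only available on platforms that support them, such as Linux.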
How to rewrite this multiprocessing code for Windows?
I’m currently using multiprocessing so I can obtain user input while running other code. This version of the code runs on Ubuntu 19.04 for me, but it doesn’t work for my friend on Windows. How can I make this code work on Windows? Also, the user input lags behind by one input. If the user presses a button it only prints
C fork and pipe multiple process
I’m trying to implement the command cat /etc/passwd | grep 1000 | cut -d: -f1 in C using the system calls fork and pipe. When I use only two commands, cmd1 | cmd2 (so one fork), it works fine, but the problem occurs when I use more than two processes. Here is my code; I think the problem is in the
Why multiprocessing.Pool and multiprocessing.Process perform so differently in Linux
I ran the test code below to check the performance of Pool and Process on Linux. I’m using Python 2.7. The source code of multiprocessing.Pool seems to show that it uses multiprocessing.Process. However, multiprocessing.Pool costs much more time and memory than an equal number of multiprocessing.Process instances, and I don’t understand why. Here is what I did: Create a large dict and then
How to detect and find out a program is in deadlock?
This is an interview question. How do you detect and find out whether a program is in deadlock? Are there tools that can be used to do that on Linux/Unix systems? My idea: If a program makes no progress and its status is running, it is in deadlock. But other causes can also produce this behavior. Open source tools are valgrind