I have Python functions that should run in parallel in a Linux environment, utilizing multiple cores. Is there a way to explicitly specify which core should be used for each process?
Currently, I am using the Python multiprocessing module to run these functions as parallel processes on 4 cores.
```python
import multiprocessing as mp

def hello(name, msg):
    try:
        print("Hello {}".format(name))
        print(msg)
        return True
    except Exception:
        return False

pool = mp.Pool(mp.cpu_count())
msg = "It's a holiday!"
name_list = ["A", "B", "C"]
hello_status = pool.starmap(hello, [(name, msg) for name in name_list])
print(hello_status)
```
Answer
Possibly, with the combination of os.sched_setaffinity and os.sched_getaffinity. The docstring says:
```
Signature: os.sched_setaffinity(pid, mask, /)
Docstring:
Set the CPU affinity of the process identified by pid to mask.
mask should be an iterable of integers identifying CPUs.
Type: builtin_function_or_method
```
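As a quick illustration of those two calls (a minimal sketch; the variable names are my own, and passing 0 as the pid means "the current process"):

```python
import os

# Query the set of CPUs the current process is allowed to run on.
allowed = os.sched_getaffinity(0)
print("Allowed CPUs:", allowed)

# Restrict this process to a single CPU from that set...
first = min(allowed)
os.sched_setaffinity(0, {first})
assert os.sched_getaffinity(0) == {first}

# ...then restore the original mask.
os.sched_setaffinity(0, allowed)
```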
I couldn't find Python-specific documentation, but the sched_setaffinity man pages should be a good first source of information.
UPDATE:
I decided to look into the multiprocessing module and cook up a working example. I came up with two ways to do it using the multiprocessing.Pool class. The first is by passing an initializer argument to the Pool constructor call; the second is by using the Pool.map function.
```python
import os
from itertools import cycle
import multiprocessing as mp

def _myinit():
    my_pid = os.getpid()
    old_aff = os.sched_getaffinity(0)
    os.sched_setaffinity(0, [0, 3])
    new_aff = os.sched_getaffinity(0)
    print("My pid is {} and my old aff was {}, my new aff is {}".format(my_pid, old_aff, new_aff))

def map_hack(AFF):
    my_pid = os.getpid()
    old_aff = os.sched_getaffinity(0)
    os.sched_setaffinity(0, AFF)
    return (my_pid, old_aff, os.sched_getaffinity(0))

PROCESSES = os.cpu_count()

# just an example iterable you could use for the map_hack
# elements of cpus must be iterables, because of os.sched_setaffinity
_mycpus = cycle(os.sched_getaffinity(0))
cpus = [[next(_mycpus)] for x in range(PROCESSES)]

# Since Python 3.3 context managers are supported for mp.Pool

# using initializer argument to change affinity
with mp.Pool(processes=PROCESSES, initializer=_myinit) as pool:
    # do something conditional on your affinity
    pool.close()
    pool.join()

print("")

# using mp.Pool.map hack to change affinity
with mp.Pool(processes=PROCESSES) as pool:
    for x in pool.map(map_hack, cpus, chunksize=1):
        print("My pid is {} and my old aff was {}, my new aff is {}".format(*x))
    # do something conditional on your affinity
    pool.close()
    pool.join()
```
Notice that using initializer I hardcoded the affinity of all processes to the first and fourth CPUs (0, 3), but that's just because I found it a bit trickier to use cycle like I did with map_hack. I also wanted to demonstrate that you can set the affinity to any (legal) number of CPUs.
I suggest you go through the code and make sure you understand it by reading the relevant docs and playing with it by changing some parameters. It should go without saying that all the print statements are only there for us to convince ourselves that the methods are working.
Finally, if you're after more control, I'd suggest using mp.Process objects instead of mp.Pool. The same tools from os should come in handy there as well.
WINDOWS:
This will not work if you’re using Windows. From the docs:
These functions control how a process is allocated CPU time by the operating system. They are only available on some Unix platforms. For more detailed information, consult your Unix manpages.
In this case you could look into win32process from the pywin32 package, specifically win32process.SetProcessAffinityMask and win32process.GetProcessAffinityMask.