Python subprocess is not scalable by default, any simple solution you can recommend to make it scalable?

I have an application which does this:

subprocess.Popen(["python3", "-u", ""])

So the Python program can start multiple long-lived processes on demand.

If I stop the main Python program and start it again, it knows what should be started, because a status record in the DB tells it.

So it works fine when there is only one replica of the Docker container, pod, virtual machine or whatever you call it.

If I scale the app to 3 replicas, subprocess can't handle it:

  1. Each Docker container starts, while I want to start on one container

  2. If one container fails, an app should be smart enough to failover to another container

  3. An app should be smart enough to balance subprocesses across containers, for example: ideally they should be spread by putting 3 processes per container so in total there are 9 subprocesses running – I don’t need this to be precise, the simplest solution that balances it is fine

I’ve tried to explore RQ (Redis Queue) and similar solutions, but they are heavily focused on tasks, ideally short-lived ones. In my case, the processes are long-lived, e.g. they can live for months or years.

The scheme is this:

Main Python app ->,, etc.

Is there a simple solution here without much overhead?

Is writing the status of each sub program to the DB an option (also detecting when a sub process fails, to fail it over to another container based on the statuses in the DB), or would you incorporate an additional tool to solve the subprocess scaling issue?

Another option is to start on all containers and scale operations inside of it. basically calls some third-party APIs and does some operations based on user preference. So scaling those API calls based on each user preference is complicated: it has multiple threads in the background when calling APIs simultaneously. In short, is tied to user1, is tied to user2, etc. So is it worth making things more complex by choosing this option?


If subprocess is only used in standalone apps, and nobody has implemented this mechanism in a way that can be found on GitHub, in libraries, etc., how would you solve this issue in Python?

I think about these entries in DB:

ProcessName    ProcessHostname    LastHeartBeat       Enabled
process1       host-0             2022-07-10 15:00    true
process2       null               null                true
process3       host-1             2022-07-10 14:50    true

So to solve three points that I wrote above:

  1. Each container tries to pick up a process that is not already picked up (one where ProcessHostname is null or LastHeartBeat is old). When the first container picks up a process, it writes the date to LastHeartBeat and then uses subprocess to start the process. Other containers cannot pick it up as long as LastHeartBeat is constantly updated.

  2. If a process fails, it stops writing LastHeartBeat, so another container picks the process up as described in point 1. If the failed container cannot reach the DB, it stops its operations and restarts (if it is even able to exit). If it cannot reach the DB, it doesn’t do anything. This is to avoid running the same process twice.

  3. To balance processes across containers, the container which is running fewer processes can pick up a new one. The info needed to make that decision is in the DB table.
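
A rough sketch of the claim-and-heartbeat scheme from the three points above, assuming a SQL table shaped like the one shown earlier. The table name `processes`, the column names, the 60-second staleness timeout and the `max_owned` limit are all assumptions made for illustration, and SQLite stands in for whatever shared database the containers would really use. The key idea is that the claim is a single conditional `UPDATE`, so when two containers race for the same row only one of them sees `rowcount == 1`.

```python
import sqlite3

STALE_AFTER = 60.0  # assumed timeout in seconds; tune to your heartbeat interval


def create_table(conn):
    # Hypothetical schema mirroring the ProcessName/ProcessHostname/LastHeartBeat/Enabled table.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS processes (
               name TEXT PRIMARY KEY,
               hostname TEXT,
               last_heartbeat REAL,
               enabled INTEGER NOT NULL DEFAULT 1
           )"""
    )


def claim_process(conn, hostname, now, max_owned=3):
    """Try to claim one unowned or stale process; return its name or None."""
    cutoff = now - STALE_AFTER
    # Crude balancing: a host that already owns max_owned live processes claims nothing.
    owned = conn.execute(
        "SELECT COUNT(*) FROM processes WHERE hostname = ? AND last_heartbeat >= ?",
        (hostname, cutoff),
    ).fetchone()[0]
    if owned >= max_owned:
        return None
    candidates = conn.execute(
        "SELECT name FROM processes WHERE enabled = 1 "
        "AND (hostname IS NULL OR last_heartbeat < ?) ORDER BY name",
        (cutoff,),
    ).fetchall()
    for (name,) in candidates:
        # The WHERE clause repeats the condition, so a row already taken
        # by another host in the meantime no longer matches.
        cur = conn.execute(
            "UPDATE processes SET hostname = ?, last_heartbeat = ? "
            "WHERE name = ? AND enabled = 1 "
            "AND (hostname IS NULL OR last_heartbeat < ?)",
            (hostname, now, name, cutoff),
        )
        conn.commit()
        if cur.rowcount == 1:
            return name
    return None


def heartbeat(conn, hostname, name, now):
    """Refresh the heartbeat; return False if another host has taken over the row."""
    cur = conn.execute(
        "UPDATE processes SET last_heartbeat = ? WHERE name = ? AND hostname = ?",
        (now, name, hostname),
    )
    conn.commit()
    return cur.rowcount == 1
```

A container would call `claim_process` in a loop, start the subprocess for each name it wins, and stop that subprocess as soon as `heartbeat` returns False or the database becomes unreachable.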

Would you solve differently? Any best practices you recommend?




TL;DR – This is a classical monolithic application scaling problem. You can solve it by redesigning your application around a microservice architecture, since its functionality is inherently decoupled between components. Once you’ve done that, it all boils down to deploying your application in a microservice-friendly fashion, and all of your design requirements will be met.

Edit: You’re currently trying to “scale-up” your application in a micro-service system (multiple processes/containers in 1 pod), which defeats the whole purpose of using it. You will have to stick with 1 subprocess <===> 1 pod for the design to really work. Otherwise, you are only introducing immense complications and this is against many design principles of micro services. More details below, if you’re interested.

Let me first summarise all that you’ve said so we can coherently discuss the design.

Application Requirements

As I understand the requirements of your application from all the information you’ve provided, the following is true:

  1. You want your processes to be long-lived.
  2. You have a parent application that spawns these long-lived processes.
  3. These processes need to be started on demand (dynamically scaled, and scaled out; see (7) below for 1 process per container).
  4. If there is no load, your application should spawn just 1 process with
  5. If a container fails, you would like your application to be able to intelligently switch traffic to a healthy container that is also running your long-lived process.
  6. The application should be able to share load across all the processes/containers currently running.
  7. A process is tied to user requests since it makes calls to 3rd party API systems in order to perform its function. So, it is favourable to have just one process inside a container for simplicity of design.

Limitations of the Current Design(s)

The Current Application Design

Currently, you have the application setup in a way that:

  1. You have one application process that spawns multiple identical processes.
  2. The application faces the user, receives requests, spawns processes as needed and scales well inside one compute unit (container, VM etc.).
  3. These processes then perform their actions and return the response to the application, which returns it to the user.

Now let’s discuss the approaches you’ve mentioned above and the challenges you’ve described with each.

Scaling Design 1 – Simply Scaling Docker Containers

This means simply creating more containers for your application. We know that this doesn’t satisfy the requirements, because scaling the application to multiple replicas starts all the processes in every replica and makes them active, which is not what you want. There is no relationship between the replicas in different containers, since the sub-processes are tied to the application running in each container, not to the overall system. The applications in different containers are unaware of each other and, more importantly, of the sub-processes each one is spawning.

So, this fails to meet requirements (3), (4) and (5).

Scaling Design 2 – Use a DB as State Storage

To try and meet (3), (4) and (5) we introduced a database that is central to our overall system and it can keep state data of different processes in our system and how certain containers can be “bound” to processes and manage them. However, this was also known to have certain limitations as you pointed out (plus my own thoughts):

  1. Such solutions are good for short-lived processes.
  2. We have to introduce a database that is fast enough to track state changes at a very quick pace, with the possibility of race conditions.
  3. We will have to write a lot of house-keeping code on top of our containers for orchestration that will use this database and some known rules (the three points you listed last) to achieve our goal, especially an orchestration component that knows when to start containers on demand. This is highly complicated.
  4. Not only do we have to spawn new processes, we also want to handle failures and automated traffic switching. This will require us to implement a “networking” component that communicates with our orchestrator, detects failed containers, re-routes incoming traffic to healthy ones and restarts the failed containers.
  5. We will also require this networking service to be able to distribute incoming traffic load across all the containers currently in our system.

This fails to meet our requirements (1) and (7) and most importantly THIS IS REINVENTING THE WHEEL!


Proposed Solution

Now let’s see how this entire problem can be re-engineered with minimum effort and we can satisfy all of our requirements.

The Proposed Application Design

I propose that you simply detach your application from your processes. This is easy enough to do, since your application accepts user requests and forwards them to a pool of identical workers, which perform their operations by making 3rd party API calls. Inherently, this maps perfectly onto micro-services.

user1 =====>      |===> worker1 => someAPIs
user2 =====>  App |===> worker2 => someAPIs
user3 =====>      |===> worker3 => someAPIs

We can intelligently leverage this. Note that not only are the elements decoupled, but all the workers are performing an identical set of functions (which can result in different output based on user inputs). Essentially you will replace

subprocess.Popen(["python3", "-u", ""])

with an API call to a service that can provide a worker for you, on demand:

output = some_api(my_worker_service, user_input)

This means, your design of the application has been preserved and you’ve simply placed your processes on different systems. So, the application now looks something like this:
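
To make the `some_api(my_worker_service, user_input)` line a bit more concrete, here is a minimal sketch of what such a call could look like over plain HTTP with JSON, using only the standard library. The `/run` endpoint path, the JSON payload shape and the function names are assumptions for illustration; your worker API may look quite different.

```python
import json
import urllib.request


def build_worker_request(base_url, user_input):
    """Build a POST request carrying user_input as JSON (hypothetical /run endpoint)."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/run",
        data=json.dumps(user_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_worker_service(base_url, user_input, timeout=10.0):
    """Send the request to the worker service and return the decoded JSON reply."""
    request = build_worker_request(base_url, user_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The application would then call something like `call_worker_service("http://worker-service", {"user": "user1"})`, where the host name is whatever name the worker service is reachable under.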

user1 =====>                         |===> worker1 => someAPIs
user2 =====>  App ==>worker_service  |===> worker2 => someAPIs
user3 =====>                         |===> worker3 => someAPIs

With this essential component of application redesign in place, let’s revisit our issues from previous designs and see if this helps us and how Kubernetes comes into the picture.

The Proposed Scaling Solution – Enter Kubernetes!

You were absolutely on the right path when you described usage of a database to maintain the state of our entire system and the orchestration logic being able to retrieve status of current containers in our system and make certain decisions. That’s exactly how Kubernetes works!

Let’s see how Kubernetes solves our problems now:

  1. Processes in Kubernetes can be long-lived. So, requirement (1) is met and limitation (1) of our database design is also mitigated.
  2. We introduced a service that will manage all of the worker processes for us. So, requirement (2) is satisfied. It will also be able to scale the processes on demand, so requirement (3) is satisfied. It will also keep a minimum process count of 1 so we don’t spawn unnecessary processes, so requirement (4) is satisfied. It will be intelligent enough to forward traffic only to processes that are healthy, so requirement (5) is met. It will also load balance traffic across all the processes it governs, so requirement (6) is met. This service will also mitigate limitations (4) and (5) of our second design.
  3. You will be allowed to size your processes as needed, to make sure that you only use the resources needed. So, requirement (7) is met.
  4. It uses a central database called etcd, which stores the state of your entire cluster, keeps it updated at all times and accounts for race conditions as well (multiple components updating the same information: it simply lets the first one to arrive win and fails the other one, forcing it to retry). This solves limitation (2) of our second design.
  5. It comes with logic to orchestrate our processes out of the box so there is no need to write any code. This mitigates limitation (3) of our second design.
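
The "first one to arrive wins, the other one retries" behaviour from point 4 is a form of optimistic concurrency control. The sketch below illustrates the idea with a tiny in-memory store that tracks a version number per key. It is only an illustration of the pattern: the class and function names are made up, and this is not etcd's API.

```python
import threading


class VersionedStore:
    """Tiny in-memory key-value store with compare-and-swap on a version number."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); a missing key reads as version 0 with value None."""
        with self._lock:
            return self._data.get(key, (0, None))

    def compare_and_swap(self, key, expected_version, new_value):
        """Write only if the key is still at expected_version; return True on success."""
        with self._lock:
            version, _ = self._data.get(key, (0, None))
            if version != expected_version:
                return False  # someone else wrote first, so the caller must re-read
            self._data[key] = (version + 1, new_value)
            return True


def update_with_retry(store, key, modify, max_attempts=5):
    """Read, modify and write back, retrying when another writer got there first."""
    for _ in range(max_attempts):
        version, value = store.get(key)
        if store.compare_and_swap(key, version, modify(value)):
            return True
    return False
```

Two writers that both read version 0 can both attempt a write, but only the first `compare_and_swap` succeeds; the second gets False, re-reads the new state and tries again.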

So, not only were we able to meet all of our requirements, we were also able to implement the solution we were trying to achieve, without writing any additional code for the orchestration! (You will just have to restructure your program a little and introduce APIs).

How to Implement This

Just note that in the k8s literature the smallest unit of computation is referred to as a Pod, which performs a single function. Architecturally, this is identical to your description of a sub-process. So, whenever I talk about ‘Pods’ I simply refer to your sub-processes.

You will take (roughly) the following steps to implement the proposed design.

  1. Rewrite some part of your application to decouple application from, introducing an API between them.
  2. Package into a container image.
  3. Deploy a small Kubernetes cluster.
  4. Create a Deployment using your image, and attach a HorizontalPodAutoscaler to it with the min replica count set to 1 and the max set to any number you want, say 10, for auto-scaling.
  5. Expose this Deployment by creating a Service. This is the “worker service” I talked about, and your application will “submit” requests to this service. And it will not have to worry about anything other than simply making a request to an API endpoint, everything else is handled by k8s.
  6. Configure your application to make API calls to this Service.
  7. Test your application and watch it scale up and down!
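
As a rough illustration of steps 4 and 5, the sketch below builds the three Kubernetes objects involved (a Deployment, a Service and a HorizontalPodAutoscaler) as plain Python dictionaries and dumps them as JSON, which `kubectl apply -f` accepts alongside YAML. The image name, object names, labels, ports, CPU request and the 70% CPU target are all placeholder assumptions; adjust them to your own worker.

```python
import json

WORKER_IMAGE = "registry.example.com/worker:latest"  # hypothetical image name
LABELS = {"app": "worker"}  # assumed label tying the Pods to the Service


def deployment(name="worker"):
    container = {
        "name": name,
        "image": WORKER_IMAGE,
        "ports": [{"containerPort": 8080}],
        # A CPU request is needed for utilization-based autoscaling.
        "resources": {"requests": {"cpu": "100m"}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": LABELS},
            "template": {
                "metadata": {"labels": LABELS},
                "spec": {"containers": [container]},
            },
        },
    }


def service(name="worker-service"):
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": LABELS, "ports": [{"port": 80, "targetPort": 8080}]},
    }


def autoscaler(target="worker", min_replicas=1, max_replicas=10):
    cpu_target = {"type": "Utilization", "averageUtilization": 70}
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": target},
        "spec": {
            "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": target},
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {"type": "Resource", "resource": {"name": "cpu", "target": cpu_target}}
            ],
        },
    }


def manifest_list():
    """Bundle all three objects into one List object for a single apply."""
    return {"apiVersion": "v1", "kind": "List", "items": [deployment(), service(), autoscaler()]}
```

Writing `json.dumps(manifest_list(), indent=2)` to a file gives you something you can keep in version control and apply to the cluster in one go.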

Now, the way this will function is as follows:

  1. Client makes a request to your application.
  2. The application forwards it to the Service API endpoint.
  3. The Service receives the API request and forwards it to one of the Pods that are running your functionality. If multiple requests are received, the Service will balance the requests across all the pods that are available. If a pod fails, it will be taken “away” from the Service by K8s so requests don’t fail.
  4. The pod will perform your functionality and provide the output.
  5. If all the pods in the Service are reaching saturation, the Deployment auto-scaling will trigger and create more pods for the Service and load sharing will resume again (scale-out). If the resource utilisation then reduces, the Deployment will remove certain pods that are not being used anymore and you will be back to 1 pod (scale-in).
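
For a feel of how the scale-out and scale-in decision in step 5 is made, the core of the Horizontal Pod Autoscaler rule is: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured min and max. The sketch below implements just that formula; the function name and defaults are mine, and the real autoscaler adds a tolerance band and stabilization windows that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count by the metric ratio, then clamp it to [min, max]."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

For example, 2 pods averaging 90% CPU against a 60% target gives 3 pods, while 4 pods averaging 10% against the same target shrinks back to 1.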

If you want, you can put your frontend application into a Deployment and Service as well which will allow you to have an even friendlier cloud-native micro-service architecture. The user will interact with an API of your front-end which will invoke the Service that is managing your workers which will return results.

I hope this helps you and you can appreciate how clearly the micro-service architecture fits into the design pattern you have, and how you can very simply adapt to it and scale your application as much as you want! Not only that, expressing your design this way will also allow you to redesign/manage/test different versions by simply managing a set of YAML manifests (text files) that you can use with Version Control as well!
