I have an application which does this:
subprocess.Popen(["python3", "-u", "sub-program-1.py"])
So the main Python program can start multiple long-lived processes on demand.
If I stop the main Python program and start it again, it knows that
sub-program-1.py should be started, because a status record in the DB tells it so.
So it simply works fine when there is only one replica of the Docker container, pod, virtual machine or whatever you call it.
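The restart behaviour above can be sketched roughly like this; `load_enabled_subprograms` is a stand-in for the real DB query, and the table/column names are hypothetical:

```python
# Minimal sketch: on startup, the main program asks the DB which
# sub-programs should be running and re-spawns them with Popen.
import subprocess

def load_enabled_subprograms():
    # Stand-in for something like: SELECT ProcessName FROM processes WHERE Enabled = 1
    return ["sub-program-1.py"]

def start_subprograms(spawn=subprocess.Popen):
    """Spawn each enabled sub-program; `spawn` is injectable for testing."""
    return [spawn(["python3", "-u", script])
            for script in load_enabled_subprograms()]
```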
If I scale the app to 3 replicas, subprocess fails to achieve what I want:

- Each Docker container starts sub-program-1.py, while I want it started in only one container.
- If one container fails, the app should be smart enough to fail over sub-program-1.py to another container.
- The app should be smart enough to balance subprocesses across containers; for example, sub-program-1.py through sub-program-9.py should ideally be spread 3 processes per container, so in total there are 9 subprocesses running. I don't need this to be precise; the simplest solution that balances them is fine.
I've tried to explore RQ (Redis Queue) and similar solutions, but they are heavily focused on tasks, ideally short-lived ones. In my case these are long-lived processes: e.g.
sub-program-1.py can live for months or years.
The scheme is this:
Main Python app ->
Does any simple solution exist here without overhead?
Is writing the status of each sub-program to the DB an option (also detecting when a subprocess fails, so it can be failed over to another container based on the statuses in the DB), or would you incorporate an additional tool to solve the subprocess scaling issue?
Another option is to start sub-program-1.py on all containers and scale the operations inside it. sub-program-1.py basically calls some third-party APIs and does some operations based on user preference. Scaling those API calls per user preference is complicated; it runs multiple background threads that call the APIs simultaneously. In short, sub-program-1.py is tied to user1, sub-program-2.py is tied to user2, etc. So is it worth making it complex by choosing this option?

It seems subprocess is used only in standalone apps, and nobody has tried to implement this mechanism at a findable scale on GitHub, in libraries, etc.
How would you solve this issue in Python?
I think about these entries in the DB:

| ProcessName | ProcessHostname | LastHeartBeat    | Enabled |
|-------------|-----------------|------------------|---------|
| process1    | host-0          | 2022-07-10 15:00 | true    |
| process2    | null            | null             | true    |
| process3    | host-1          | 2022-07-10 14:50 | true    |
So, to solve the three points that I wrote above:

1. Each container tries to pick up a process that is not already picked up (where LastHeartBeat is null or an old date). When the first container picks up a process, it writes the current date to LastHeartBeat and then uses subprocess to start the process. Other containers cannot pick it up while LastHeartBeat is constantly updated.
2. If a process fails, it stops writing LastHeartBeat, so another container picks up the process as described in point 1. If the failed container cannot reach the DB, it stops its operations and restarts (if it is even able to exit); if it cannot reach the DB, it does nothing, so the same process never runs twice.
3. To balance processes across containers, the container running the fewest processes picks up a new one. The info needed for that decision is in the DB table.
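Point 1's claim logic can be sketched as a single conditional UPDATE, so the DB itself arbitrates which container wins. This is a minimal sketch using an in-memory sqlite3 DB for self-containment (a real deployment would use the shared DB); the STALE_AFTER timeout is an assumption:

```python
# Lease/heartbeat claim sketch: a container "owns" a process only if its
# conditional UPDATE actually changed a row.
import sqlite3
from datetime import datetime, timedelta

STALE_AFTER = timedelta(minutes=5)  # assumed heartbeat timeout

def try_claim(conn, process_name, hostname, now=None):
    """Atomically claim a process whose heartbeat is missing or stale.
    Returns True only if this container won the claim."""
    now = now or datetime.utcnow()
    cutoff = (now - STALE_AFTER).isoformat(" ")
    cur = conn.execute(
        """UPDATE processes
           SET ProcessHostname = ?, LastHeartBeat = ?
           WHERE ProcessName = ? AND Enabled = 1
             AND (LastHeartBeat IS NULL OR LastHeartBeat < ?)""",
        (hostname, now.isoformat(" "), process_name, cutoff))
    conn.commit()
    return cur.rowcount == 1  # exactly one row updated => we own the lease

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE processes (
    ProcessName TEXT PRIMARY KEY, ProcessHostname TEXT,
    LastHeartBeat TEXT, Enabled INTEGER)""")
conn.execute("INSERT INTO processes VALUES ('process2', NULL, NULL, 1)")

won = try_claim(conn, "process2", "host-2")   # heartbeat was null: claimed
lost = try_claim(conn, "process2", "host-1")  # heartbeat now fresh: rejected
```

The key design point is that the check and the write happen in one statement, so two containers racing for the same row cannot both see `rowcount == 1`.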
Would you solve differently? Any best practices you recommend?
TL;DR – This is a classic monolithic-application scaling problem that you can easily solve by redesigning your application to a microservice architecture, since your application's functionality is inherently decoupled between components. Once you've done that, it all really boils down to deploying your application in a natively microservice-friendly fashion, and all of your design requirements will be met.
Edit: You’re currently trying to “scale-up” your application in a micro-service system (multiple processes/containers in 1 pod), which defeats the whole purpose of using it. You will have to stick with 1 subprocess <===> 1 pod for the design to really work. Otherwise, you are only introducing immense complications and this is against many design principles of micro services. More details below, if you’re interested.
Let me first summarise all that you’ve said so we can coherently discuss the design.
As I understand the requirements of your application from all the information you’ve provided, the following is true:
1. You want your processes to be long-lived.
2. You have a parent application that spawns these processes.
3. These processes need to be started on-demand (dynamically scaled, and scaled out; see (7) below for why there is 1 process per container).
4. If there is no load, your application should spawn just 1 process.
5. If a container fails, you would like your application to be able to intelligently switch traffic to a healthy container that is also running your long-lived process.
6. The application should be able to share load across all the processes/containers.
7. Each process is tied to user requests, since it makes calls to third-party API systems in order to perform its function. So, it is favourable to have just one process inside a container for simplicity of design.
Limitations of the Current Design(s)
The Current Application Design
Currently, you have the application set up in a way that:

- You have one application process that spawns multiple identical sub-process.py processes.
- The application faces the user, receives requests, and spawns sub-process.py processes as needed, scaling well inside one compute unit (container, VM, etc.).
- These processes then perform their actions and return the response to the application, which returns it to the user.
Now let’s discuss your current approach(es) that you’ve mentioned above and see what are the challenges that you’ve described.
Scaling Design 1 – Simply Scaling Docker Containers
This means simply creating more containers for your application. And we know that it doesn't satisfy the requirements, because scaling the application to multiple replicas starts all the processes and makes them active. This is not what you want. Moreover, there is no relationship between these replicas in different containers, since the sub-processes are tied to the application running in each container, not to the overall system. This is obviously because the applications in different containers are unaware of each other (and, more importantly, of the sub-processes each is spawning).
So, this fails to meet our requirements (3), (4), (5).
Scaling Design 2 – Use a DB as State Storage
To try and meet (3), (4) and (5) we introduced a database that is central to our overall system; it can keep state data for the different processes in the system and record how certain containers are "bound" to processes and manage them. However, this was also known to have certain limitations, as you pointed out (plus my own thoughts):

- Such solutions are good for short-lived tasks, not for processes that may live for months or years.
- We have to introduce a database that is high-speed and able to maintain state at a very quick pace, with the possibility of race conditions.
- We will have to write a lot of house-keeping code on top of our containers for orchestration that uses this database and some known rules (the ones you defined as your last 3 points) to achieve our goal, especially an orchestration component that knows when to start containers on demand. This is highly complicated.
- Not only do we have to spawn new processes, we also want to handle failures and automated traffic switching. This requires us to implement a "networking" component that communicates with our orchestrator, detects failed containers, re-routes incoming traffic to healthy ones, and restarts the failed containers.
- We will also require this networking service to distribute incoming traffic load across all the containers currently in our system.
This fails to meet our requirements (1) and (7) and most importantly THIS IS REINVENTING THE WHEEL!
LET’S TALK ABOUT KUBERNETES AND WHY IT IS EXACTLY WHAT YOU NEED.
Now let’s see how this entire problem can be re-engineered with minimum effort and we can satisfy all of our requirements.
The Proposed Application Design
I propose that you very simply detach your application from your processes. This is easy enough to do, since your application accepts user requests and forwards them to an identical pool of workers, which perform their operation by making third-party API calls. Inherently, this maps perfectly onto micro-services.
```
user1 =====>          |===> worker1 => someAPIs
user2 =====>   App    |===> worker2 => someAPIs
user3 =====>          |===> worker3 => someAPIs
...
```
We can intelligently leverage this. Note that not only are the elements decoupled, but all the workers perform an identical set of functions (which can produce different output based on user inputs). Essentially you will replace
subprocess.Popen(["python3", "-u", "sub-program-1.py"])
with an API call to a service that can provide a worker for you, on demand:
output = some_api(my_worker_service, user_input)
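A minimal sketch of what that call could look like; the worker service is stubbed in-process with http.server so the example is self-contained, and the `/work` endpoint and the payload shape are assumptions, not a real API:

```python
# Sketch: replace Popen with an HTTP call to a worker service.
# Here the "service" is a local stub; in the real design it would be
# the Kubernetes Service described below.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class WorkerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Echo back the user with a stub result, like a worker would.
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        payload = json.dumps({"user": body["user"], "result": "done"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # silence per-request logging
        pass

def call_worker_service(base_url, user_input):
    """POST the user's input to the worker service and return its JSON reply."""
    req = urllib.request.Request(
        base_url + "/work",
        data=json.dumps(user_input).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

server = HTTPServer(("127.0.0.1", 0), WorkerHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
output = call_worker_service(f"http://127.0.0.1:{server.server_port}",
                             {"user": "user1"})
server.shutdown()
```

The application no longer cares where the worker runs; it only knows the service's URL.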
This means, your design of the application has been preserved and you’ve simply placed your processes on different systems. So, the application now looks something like this:
```
user1 =====>                          |===> worker1 => someAPIs
user2 =====>  App ==> worker_service  |===> worker2 => someAPIs
user3 =====>                          |===> worker3 => someAPIs
...
```
With this essential component of application redesign in place, let’s revisit our issues from previous designs and see if this helps us and how Kubernetes comes into the picture.
The Proposed Scaling Solution – Enter Kubernetes!
You were absolutely on the right path when you described using a database to maintain the state of the entire system, with orchestration logic able to retrieve the status of the current containers in the system and make certain decisions. That's exactly how Kubernetes works!
Let's see how Kubernetes solves our problems now:

- Processes (pods) in Kubernetes can be long-lived. So, requirement (1) is met, and limitation (1) of our database design is also mitigated.
- We introduce a Service that will manage all of the worker processes for us. So, requirement (2) is satisfied. It will also be able to scale the processes on-demand, so requirement (3) is satisfied. It will keep a minimum of 1 pod so we don't spawn unnecessary processes, so requirement (4) is satisfied. It will be intelligent enough to forward traffic only to healthy pods, so requirement (5) is met. It will also load balance traffic across all the processes it governs, so requirement (6) is met. This service also mitigates limitations (4) and (5) of our second design.
- You will be able to size your processes as needed, to make sure that you only use the resources required. So, requirement (7) is met.
- It uses a central database called etcd, which stores the state of your entire cluster, keeps it updated at all times, and accommodates race conditions as well (when multiple components update the same information, it simply lets the first one to arrive win and fails the other, forcing it to retry). We've solved problem (2) from our second design.
- It comes with logic to orchestrate our processes out of the box, so there is no need to write any code. This mitigates limitation (3) of our second design.
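The "first one to arrive wins, the other retries" behaviour mentioned above is optimistic concurrency control. A minimal sketch of the pattern (the `VersionedStore` class is an illustrative stand-in, not the etcd API):

```python
# Optimistic concurrency sketch: writers read a version, compute a new
# value, and commit only if the version is unchanged; a stale writer
# fails and must re-read before retrying.
import threading

class VersionedStore:
    def __init__(self):
        self._lock = threading.Lock()
        self.value, self.version = {}, 0

    def read(self):
        with self._lock:
            return dict(self.value), self.version

    def compare_and_set(self, expected_version, new_value):
        with self._lock:
            if self.version != expected_version:
                return False  # someone else won; caller must retry
            self.value, self.version = new_value, self.version + 1
            return True

store = VersionedStore()
value, version = store.read()
value["pods"] = 1
first = store.compare_and_set(version, value)        # first writer wins
second = store.compare_and_set(version, {"pods": 2}) # stale version fails
```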
So, not only were we able to meet all of our requirements, we were also able to implement the solution we were trying to achieve, without writing any additional code for the orchestration! (You will just have to restructure your program a little and introduce APIs).
How to Implement This
Just note that in the k8s literature the smallest unit of computation is referred to as a pod, which performs a single function. Architecturally, this is identical to your description of a sub-process. So, whenever I talk about 'Pods', I simply refer to your sub-process.py workers.
You will take (roughly) the following steps to implement the proposed design.
- Rewrite some part of your application to decouple it from sub-process.py, introducing an API between them.
- Package sub-process.py into a container image.
- Deploy a small Kubernetes cluster.
- Create a Deployment using your sub-process.py image, and set the min replica count to 1 and the max to any number you want, say 10, for auto-scaling.
- Expose this Deployment by creating a Service. This is the "worker service" I talked about, and your application will "submit" requests to this service. It will not have to worry about anything other than simply making a request to an API endpoint; everything else is handled by k8s.
- Configure your application to make API calls to this Service.
- Test your application and watch it scale up and down!
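As a rough sketch, the Deployment, Service, and auto-scaling steps above could look like the following manifests (all names, the image, and the ports are placeholders you would replace):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 1                       # minimum: 1 pod when there is no load
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
        - name: worker
          image: my-registry/sub-process:latest   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: worker-service              # the "worker service" above
spec:
  selector:
    app: worker
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 1
  maxReplicas: 10                   # the "say 10" max from the steps above
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```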
Now, the way this will function is as follows:
- A client makes a request to your application.
- The application forwards the request to the Service's API endpoint.
- The Service receives the API request and forwards it to one of the Pods running your sub-process.py functionality. If multiple requests are received, the Service balances them across all the pods that are available. If a pod fails, it is taken "away" from the service by k8s so requests don't fail.
- The pod performs your functionality and provides the output.
- If all the pods in the Service are reaching saturation, the Deployment auto-scaling will trigger and create more pods for the Service, and load sharing will resume again (scale-out). If resource utilisation then reduces, the Deployment will remove pods that are no longer being used and you will be back to 1 pod (scale-in).
If you want, you can put your frontend application into a Service as well, which will give you an even friendlier cloud-native micro-service architecture. The user will interact with an API of your front-end, which will invoke the Service that is managing your sub-process.py workers, which will return the results.
I hope this helps you and you can appreciate how clearly the micro-service architecture fits into the design pattern you have, and how you can very simply adapt to it and scale your application as much as you want! Not only that, expressing your design this way will also allow you to redesign/manage/test different versions by simply managing a set of YAML manifests (text files) that you can use with Version Control as well!