I have an application which does this:
subprocess.Popen(["python3", "-u", "sub-program-1.py"])
So the Python program can start multiple long-lived processes on demand.
If I stop the main Python program and start it again, it knows that sub-program-1.py should be started, because there is a status record in the DB that tells it so.
So it simply works fine when there is only one replica of the Docker container, pod, virtual machine or whatever you call it.
If I scale the app to 3 replicas, subprocess fails to achieve what I need:

- Each Docker container starts sub-program-1.py, while I want it to start on only one container.
- If one container fails, the app should be smart enough to fail over sub-program-1.py to another container.
- The app should be smart enough to balance subprocesses across containers. For example, sub-program-1.py – sub-program-9.py should ideally be spread by putting 3 processes per container, so that in total there are 9 subprocesses running. I don't need this to be precise; the simplest solution that balances it is fine.
I’ve tried to explore RQ (Redis Queue) and similar solutions, but they are heavily focused on tasks, ideally short-lived ones. In my case, these are long-lived processes; e.g. sub-program-1.py can live for months or years.
The scheme is this:
Main Python app -> sub-program-1.py, sub-program-2.py, etc.
Does any simple solution exist here without much overhead?
Is writing the status of each sub-program to the DB an option (also detecting when a subprocess fails, in order to fail it over to another container based on the statuses in the DB), or would you incorporate an additional tool to solve the subprocess scaling issue?
Another option is to start sub-program-1.py on all containers and scale the operations inside it. sub-program-1.py basically calls some third-party APIs and performs some operations based on user preferences. Scaling those API calls per user preference is complicated; it uses multiple background threads to call the APIs simultaneously. In short, sub-program-1.py is tied to user1, sub-program-2.py is tied to user2, etc. So is it worth making it complex by choosing this option?
Update
If subprocess is only used in standalone apps and nobody has tried to implement this mechanism at a findable scale on GitHub, in libraries, etc., how would you solve this issue in Python?
I am thinking about these entries in the DB:
| ProcessName | ProcessHostname | LastHeartBeat    | Enabled |
|-------------|-----------------|------------------|---------|
| process1    | host-0          | 2022-07-10 15:00 | true    |
| process2    | null            | null             | true    |
| process3    | host-1          | 2022-07-10 14:50 | true    |
So, to solve the three points I wrote above (a rough sketch of this loop is shown after the list):

1. Each container tries to pick up a process that is not already picked up (where LastHeartBeat is null or has an old date). When the first container picks up a process, it writes the date to LastHeartBeat and then uses subprocess to start the process. Other containers cannot pick it up while LastHeartBeat is constantly being updated.
2. If a process fails, it no longer writes LastHeartBeat, so another container picks up the process as described in point 1. If the failed container cannot reach the DB, it stops its operations and restarts (if it's even able to exit). If it cannot reach the DB, it doesn't do anything. That is to avoid running the same process twice.
3. To balance processes across containers, the container which is running the fewest processes can pick up a new one. The information needed for that decision is in the DB table.
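Roughly, the claim-and-heartbeat loop I have in mind would look something like this (just a sketch, assuming a PostgreSQL-style `processes` table shaped like the one above and the psycopg2 driver; all names are illustrative):

```python
import subprocess
import time
from datetime import datetime, timedelta, timezone

import psycopg2  # assumed driver; any DB client with atomic UPDATEs would do

STALE = timedelta(minutes=5)   # LastHeartBeat older than this counts as an "old date"
HEARTBEAT_EVERY = 60           # seconds between heartbeat updates


def try_claim(conn, process_name, hostname):
    """Atomically claim a process whose heartbeat is null or stale."""
    cutoff = datetime.now(timezone.utc) - STALE
    with conn.cursor() as cur:
        cur.execute(
            """
            UPDATE processes
               SET ProcessHostname = %s, LastHeartBeat = now()
             WHERE ProcessName = %s AND Enabled
               AND (LastHeartBeat IS NULL OR LastHeartBeat < %s)
            """,
            (hostname, process_name, cutoff),
        )
        claimed = cur.rowcount == 1  # exactly one row updated means we won the claim
    conn.commit()
    return claimed


def run_claimed(conn, process_name, hostname):
    """Start the sub-program and keep the heartbeat fresh while it runs."""
    proc = subprocess.Popen(["python3", "-u", f"{process_name}.py"])
    while proc.poll() is None:
        try:
            with conn.cursor() as cur:
                cur.execute(
                    "UPDATE processes SET LastHeartBeat = now() "
                    "WHERE ProcessName = %s AND ProcessHostname = %s",
                    (process_name, hostname),
                )
            conn.commit()
        except psycopg2.OperationalError:
            # Can't prove we still own the lease, so stop rather than risk
            # running the same process twice on two containers.
            proc.terminate()
            raise
        time.sleep(HEARTBEAT_EVERY)
```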
Would you solve this differently? Are there any best practices you recommend?
Thanks
Answer
TL;DR – This is a classic monolithic application scaling problem. You can easily solve it by redesigning your application to a microservice architecture, since its functionality is inherently decoupled between components. Once you've done that, it all really boils down to deploying your application in a natively microservice-friendly fashion, and all of your design requirements will be met.
Edit: You’re currently trying to “scale up” your application in a microservice system (multiple processes/containers in 1 pod), which defeats the whole purpose of using one. You will have to stick with 1 subprocess <===> 1 pod for the design to really work. Otherwise, you are only introducing immense complications, and this goes against many design principles of microservices. More details below, if you’re interested.
Let me first summarise all that you’ve said so we can coherently discuss the design.
Application Requirements
As I understand the requirements of your application from all the information you’ve provided, the following is true:
1. You want your processes to be long-lived.
2. You have a parent application that spawns these long-lived processes.
3. These processes need to be started on-demand (dynamically scaled – and scaled out; see (7) below for 1 process per container).
4. If there is no load, your application should spawn just 1 process with sub-process.py.
5. If a container fails, you would like your application to be able to intelligently switch traffic to a healthy container that is also running your long-lived process.
6. The application should be able to share load across all the processes/containers currently running.
7. A process is tied to user requests, since it makes calls to 3rd-party API systems in order to perform its function. So, it is favourable to have just one process inside a container, for simplicity of design.
Limitations of the Current Design(s)
The Current Application Design
Currently, you have the application set up in a way that:
- You have one application process that spawns multiple identical sub-process.py processes from within the application process.
- The application faces the user, receives requests, and spawns sub-process.py processes as needed, and this scales well inside one compute unit (container, VM, etc.).
- These processes then perform their actions and return the response to the application, which returns it to the user.
Now let’s discuss the approach(es) you’ve mentioned above and look at the challenges you’ve described.
Scaling Design 1 – Simply Scaling Docker Containers
This means simply creating more containers for your application. And we know that it doesn’t satisfy the requirements, because scaling the application to multiple replicas starts all the processes and makes them active. This is not what you want, and there is no relationship between these replicas in different containers (since the sub-processes are tied to the application running in each container, not to the overall system). This is obviously because the applications in different containers are unaware of each other (and, more importantly, of the sub-processes each one is spawning).
So, this fails to meet our requirements (3), (4) and (5).
Scaling Design 2 – Use a DB as State Storage
To try and meet (3), (4) and (5) we introduce a database that is central to our overall system; it can keep state data for the different processes in our system and record how certain containers are “bound” to processes and manage them. However, this is also known to have certain limitations, as you pointed out (plus my own thoughts):

1. Such solutions are good for short-lived processes.
2. We have to introduce a database that is high-speed and able to maintain state at a very quick pace, with a possibility of race conditions.
3. We will have to write a lot of house-keeping code on top of our containers for orchestration that will use this database and some known rules (the ones you defined as your last 3 points) to achieve our goal – especially an orchestration component that knows when to start containers on demand. This is highly complicated.
4. Not only do we have to spawn new processes, we also want to be able to handle failures and automated traffic switching. This will require us to implement a “networking” component that communicates with our orchestrator, detects failed containers, re-routes incoming traffic to healthy ones and restarts the failed containers.
5. We will also require this networking service to be able to distribute incoming traffic load across all the containers currently in our system.
This fails to meet our requirements (1) and (7) and most importantly THIS IS REINVENTING THE WHEEL!
LET’S TALK ABOUT KUBERNETES AND WHY IT IS EXACTLY WHAT YOU NEED.
Proposed Solution
Now let’s see how this entire problem can be re-engineered with minimum effort while satisfying all of our requirements.
The Proposed Application Design
I propose that you very simply detach your application from your processes. This is easy enough to do, since your application accepts user requests and forwards them to an identical pool of workers, which perform their operation by making 3rd-party API calls. Inherently, this maps perfectly onto microservices.
    user1 =====>          |===> worker1 => someAPIs
    user2 =====>   App    |===> worker2 => someAPIs
    user3 =====>          |===> worker3 => someAPIs
    ...
We can intelligently leverage this. Note that not only are the elements decoupled, but all the workers are performing an identical set of functions (which can result in different output based on user inputs). Essentially you will replace
subprocess.Popen(["python3", "-u", "sub-program-1.py"])
with an API call to a service that can provide a worker for you, on demand:
output = some_api(my_worker_service, user_input)
This means your application's design has been preserved; you’ve simply placed your processes on different systems. So, the application now looks something like this:
    user1 =====>                            |===> worker1 => someAPIs
    user2 =====>   App ==> worker_service   |===> worker2 => someAPIs
    user3 =====>                            |===> worker3 => someAPIs
    ...
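Just to illustrate (the URL, path and payload here are placeholders, not an existing API), `some_api` can be as thin as an HTTP call to the worker service, e.g. with `requests`:

```python
import requests


def some_api(worker_service_url: str, user_input: dict, timeout: int = 300) -> dict:
    """Forward a user's request to whichever healthy worker the service routes it to."""
    resp = requests.post(worker_service_url, json=user_input, timeout=timeout)
    resp.raise_for_status()
    return resp.json()


# Inside the main application, instead of subprocess.Popen(...):
my_worker_service = "http://worker-service/run"  # the worker service's DNS name (placeholder)
# output = some_api(my_worker_service, {"user_id": "user1", "preferences": {}})
```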
With this essential component of application redesign in place, let’s revisit our issues from previous designs and see if this helps us and how Kubernetes comes into the picture.
The Proposed Scaling Solution – Enter Kubernetes!
You were absolutely on the right path when you described using a database to maintain the state of our entire system, with the orchestration logic able to retrieve the status of the current containers in the system and make certain decisions. That’s exactly how Kubernetes works!
Let’s see how Kubernetes solves our problems now:

- Processes in Kubernetes can be long-lived. So, requirement (1) is met and limitation (1) of our database design is also mitigated.
- We introduce a Service that will manage all of the worker processes for us. So, requirement (2) is satisfied. It will also be able to scale the processes on demand, so requirement (3) is satisfied. It will also keep a minimum process count of 1 so we don’t spawn unnecessary processes, so requirement (4) is satisfied. It will be intelligent enough to forward traffic only to processes that are healthy, so requirement (5) is met. It will also load-balance traffic across all the processes it governs, so requirement (6) is met. This Service also mitigates limitations (4) and (5) of our second design.
- You will be able to size your processes as needed, to make sure that you only use the resources required. So, requirement (7) is met.
- It uses a central database called etcd, which stores the state of your entire cluster, keeps it updated at all times, and accommodates race conditions as well (multiple components updating the same information – it simply lets the first one to arrive win and fails the other one, forcing it to retry). We’ve solved problem (2) from our second design.
- It comes with logic to orchestrate our processes out of the box, so there is no need to write any code. This mitigates limitation (3) of our second design.
So, not only were we able to meet all of our requirements, we were also able to implement the solution we were trying to achieve, without writing any additional code for the orchestration! (You will just have to restructure your program a little and introduce APIs).
How to Implement This
Just note that in the k8s literature the smallest unit of computation is referred to as a pod, which performs a single function. Architecturally, this is identical to your description of a sub-process. So, whenever I talk about “Pods”, I am simply referring to your sub-processes.
You will take (roughly) the following steps to implement the proposed design.
- Rewrite some part of your application to decouple the application from sub-process.py, introducing an API between them (a minimal sketch of such a worker wrapper follows this list).
- Package sub-process.py into a container image.
- Deploy a small Kubernetes cluster.
- Create a Deployment using your sub-process.py image, and set the min replica count to 1 and the max to any number you want, say 10, for auto-scaling.
- Expose this Deployment by creating a Service. This is the “worker service” I talked about, and your application will “submit” requests to this Service. It will not have to worry about anything other than simply making a request to an API endpoint; everything else is handled by k8s.
- Configure your application to make API calls to this Service.
- Test your application and watch it scale up and down!
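For the first two steps, the container image only needs to expose your existing per-user logic behind a small HTTP endpoint. A minimal sketch with Flask, where `do_work` is just a stand-in for whatever sub-process.py does today:

```python
# worker.py – the entrypoint packaged into the container image used by the Deployment
from flask import Flask, jsonify, request

app = Flask(__name__)


def do_work(user_input: dict) -> dict:
    """Stand-in for the existing sub-process.py logic:
    call the third-party APIs according to the user's preferences."""
    return {"status": "ok", "echo": user_input}


@app.route("/run", methods=["POST"])
def run():
    result = do_work(request.get_json(force=True))
    return jsonify(result)


if __name__ == "__main__":
    # In the real image you would typically run this behind gunicorn or similar.
    app.run(host="0.0.0.0", port=8080)
```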
Now, the way this will function is as follows:
- A client makes a request to your application.
- The application forwards it to the Service API endpoint.
- The Service receives the API request and forwards it to one of the Pods that are running your sub-process.py functionality. If multiple requests are received, the Service balances them across all the pods that are available. If a pod fails, it is taken “away” from the Service by k8s so requests don’t fail.
- The pod performs your functionality and provides the output.
- If all the pods in the Service are reaching saturation, the Deployment auto-scaling will trigger and create more pods for the Service, and load sharing will resume again (scale-out). If the resource utilisation then reduces, the Deployment will remove certain pods that are no longer being used and you will be back to 1 pod (scale-in). See the autoscaler sketch below.
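If you want that scale-out/scale-in behaviour to happen automatically, you attach a HorizontalPodAutoscaler to the worker Deployment. As a sketch using the official `kubernetes` Python client (the Deployment name `worker` and the thresholds are assumptions; the same thing is more commonly written as a YAML manifest or created with `kubectl autoscale`):

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="worker-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="worker"
        ),
        min_replicas=1,                         # scale in to a single pod when idle
        max_replicas=10,                        # upper bound for scale-out
        target_cpu_utilization_percentage=80,   # scale out when pods approach saturation
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```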
If you want, you can put your frontend application into a Deployment and Service as well, which will allow you to have an even friendlier cloud-native microservice architecture. The user will interact with an API of your front-end, which will invoke the Service that is managing your sub-process.py workers, which will return the results.
I hope this helps you and you can appreciate how clearly the micro-service architecture fits into the design pattern you have, and how you can very simply adapt to it and scale your application as much as you want! Not only that, expressing your design this way will also allow you to redesign/manage/test different versions by simply managing a set of YAML manifests (text files) that you can use with Version Control as well!