I have an application which does this:
subprocess.Popen(["python3", "-u", "sub-program-1.py"])
So the Python program can start multiple long-lived processes on demand.
If I stop the main Python program and start it again, it knows that sub-program-1.py should be started, because there is a status record in the DB that tells it so.
So it simply works fine when there is only one replica of the Docker container, pod, virtual machine or whatever you call it.
If I scale the app to 3 replicas, subprocess fails to achieve what I need:

- Each Docker container starts sub-program-1.py, while I want it to start on only one container.
- If one container fails, the app should be smart enough to fail over sub-program-1.py to another container.
- The app should be smart enough to balance subprocesses across containers. For example, sub-program-1.py – sub-program-9.py should ideally be spread by putting 3 processes per container, so that in total there are 9 subprocesses running. I don't need this to be precise; the simplest solution that balances it is fine.
I’ve tried to explore RQ (Redis Queue) and similar solutions, but they are heavily focused on tasks, ideally short-lived ones. In my case, these are long-lived processes; e.g. sub-program-1.py can live for months or years.
The scheme is this:
Main Python app -> sub-program-1.py, sub-program-2.py, etc.
Does any simple solution exist here without much overhead?
Is writing the status of each sub-program to the DB an option (also detecting when a subprocess fails, in order to fail it over to another container based on the statuses in the DB), or would you incorporate an additional tool to solve the subprocess scaling issue?
Another option is to start sub-program-1.py on all containers and scale the operations inside it. sub-program-1.py basically calls some third-party APIs and performs some operations based on user preferences. Scaling those API calls per user preference is complicated; it uses multiple background threads to call the APIs simultaneously. In short, sub-program-1.py is tied to user1, sub-program-2.py is tied to user2, etc. So is it worth making it complex by choosing this option?
Update
If subprocess is only used in standalone apps and nobody has tried to implement this mechanism at a findable scale on GitHub, in libraries, etc., how would you solve this issue in Python?
I am thinking about these entries in the DB:
| ProcessName | ProcessHostname | LastHeartBeat    | Enabled |
|-------------|-----------------|------------------|---------|
| process1    | host-0          | 2022-07-10 15:00 | true    |
| process2    | null            | null             | true    |
| process3    | host-1          | 2022-07-10 14:50 | true    |
So, to solve the three points I wrote above (a rough sketch of this loop is shown after the list):

1. Each container tries to pick up a process that is not already picked up (where LastHeartBeat is null or has an old date). When the first container picks up a process, it writes the date to LastHeartBeat and then uses subprocess to start the process. Other containers cannot pick it up while LastHeartBeat is constantly being updated.
2. If a process fails, it no longer writes LastHeartBeat, so another container picks up the process as described in point 1. If the failed container cannot reach the DB, it stops its operations and restarts (if it's even able to exit). If it cannot reach the DB, it doesn't do anything. That is to avoid running the same process twice.
3. To balance processes across containers, the container which is running the fewest processes can pick up a new one. The information needed for that decision is in the DB table.
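Roughly, the claim-and-heartbeat loop I have in mind would look something like this (just a sketch, assuming a PostgreSQL-style `processes` table shaped like the one above and the psycopg2 driver; all names are illustrative):

```python
import subprocess
import time
from datetime import datetime, timedelta, timezone

import psycopg2  # assumed driver; any DB client with atomic UPDATEs would do

STALE = timedelta(minutes=5)   # LastHeartBeat older than this counts as an "old date"
HEARTBEAT_EVERY = 60           # seconds between heartbeat updates


def try_claim(conn, process_name, hostname):
    """Atomically claim a process whose heartbeat is null or stale."""
    cutoff = datetime.now(timezone.utc) - STALE
    with conn.cursor() as cur:
        cur.execute(
            """
            UPDATE processes
               SET ProcessHostname = %s, LastHeartBeat = now()
             WHERE ProcessName = %s AND Enabled
               AND (LastHeartBeat IS NULL OR LastHeartBeat < %s)
            """,
            (hostname, process_name, cutoff),
        )
        claimed = cur.rowcount == 1  # exactly one row updated means we won the claim
    conn.commit()
    return claimed


def run_claimed(conn, process_name, hostname):
    """Start the sub-program and keep the heartbeat fresh while it runs."""
    proc = subprocess.Popen(["python3", "-u", f"{process_name}.py"])
    while proc.poll() is None:
        try:
            with conn.cursor() as cur:
                cur.execute(
                    "UPDATE processes SET LastHeartBeat = now() "
                    "WHERE ProcessName = %s AND ProcessHostname = %s",
                    (process_name, hostname),
                )
            conn.commit()
        except psycopg2.OperationalError:
            # Can't prove we still own the lease, so stop rather than risk
            # running the same process twice on two containers.
            proc.terminate()
            raise
        time.sleep(HEARTBEAT_EVERY)
```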
Would you solve this differently? Are there any best practices you recommend?
Thanks
Answer
TL;DR – This is a classic monolithic application scaling problem. You can easily solve it by redesigning your application to a microservice architecture, since its functionality is inherently decoupled between components. Once you've done that, it all really boils down to deploying your application in a natively microservice-friendly fashion, and all of your design requirements will be met.
Edit: You’re currently trying to “scale up” your application in a microservice system (multiple processes/containers in 1 pod), which defeats the whole purpose of using one. You will have to stick with 1 subprocess <===> 1 pod for the design to really work. Otherwise, you are only introducing immense complications, and this goes against many design principles of microservices. More details below, if you’re interested.
Let me first summarise all that you’ve said so we can coherently discuss the design.
Application Requirements
As I understand the requirements of your application from all the information you’ve provided, the following is true:
1. You want your processes to be long-lived.
2. You have a parent application that spawns these long-lived processes.
3. These processes need to be started on-demand (dynamically scaled – and scaled out; see (7) below for 1 process per container).
4. If there is no load, your application should spawn just 1 process with sub-process.py.
5. If a container fails, you would like your application to be able to intelligently switch traffic to a healthy container that is also running your long-lived process.
6. The application should be able to share load across all the processes/containers currently running.
7. A process is tied to user requests, since it makes calls to 3rd-party API systems in order to perform its function. So, it is favourable to have just one process inside a container, for simplicity of design.
Limitations of the Current Design(s)
The Current Application Design
Currently, you have the application set up in a way that:
- You have one application process that spawns multiple identical sub-process.py processes from within the application process.
- The application faces the user, receives requests, and spawns sub-process.py processes as needed, and this scales well inside one compute unit (container, VM, etc.).
- These processes then perform their actions and return the response to the application, which returns it to the user.
Now let’s discuss the approach(es) you’ve mentioned above and look at the challenges you’ve described.
Scaling Design 1 – Simply Scaling Docker Containers
This means simply creating more containers for your application. And we know that it doesn’t satisfy the requirements, because scaling the application to multiple replicas starts all the processes and makes them active. This is not what you want, and there is no relationship between these replicas in different containers (since the sub-processes are tied to the application running in each container, not to the overall system). This is obviously because the applications in different containers are unaware of each other (and, more importantly, of the sub-processes each one is spawning).
So, this fails to meet our requirements (3), (4) and (5).
Scaling Design 2 – Use a DB as State Storage
To try and meet (3), (4) and (5) we introduce a database that is central to our overall system; it can keep state data for the different processes in our system and record how certain containers are “bound” to processes and manage them. However, this is also known to have certain limitations, as you pointed out (plus my own thoughts):

1. Such solutions are good for short-lived processes.
2. We have to introduce a database that is high-speed and able to maintain state at a very quick pace, with a possibility of race conditions.
3. We will have to write a lot of house-keeping code on top of our containers for orchestration that will use this database and some known rules (the ones you defined as your last 3 points) to achieve our goal – especially an orchestration component that knows when to start containers on demand. This is highly complicated.
4. Not only do we have to spawn new processes, we also want to be able to handle failures and automated traffic switching. This will require us to implement a “networking” component that communicates with our orchestrator, detects failed containers, re-routes incoming traffic to healthy ones and restarts the failed containers.
5. We will also require this networking service to be able to distribute incoming traffic load across all the containers currently in our system.
This fails to meet our requirements (1) and (7) and most importantly THIS IS REINVENTING THE WHEEL!
LET’S TALK ABOUT KUBERNETES AND WHY IT IS EXACTLY WHAT YOU NEED.
Proposed Solution
Now let’s see how this entire problem can be re-engineered with minimum effort while satisfying all of our requirements.
The Proposed Application Design
I propose that you very simply detach your application from your processes. This is easy enough to do, since your application accepts user requests and forwards them to an identical pool of workers, which perform their operation by making 3rd-party API calls. Inherently, this maps perfectly onto microservices.
    user1 =====>          |===> worker1 => someAPIs
    user2 =====>   App    |===> worker2 => someAPIs
    user3 =====>          |===> worker3 => someAPIs
    ...
We can intelligently leverage this. Note that not only are the elements decoupled, but all the workers are performing an identical set of functions (which can result in different output based on user inputs). Essentially you will replace
subprocess.Popen(["python3", "-u", "sub-program-1.py"])
with an API call to a service that can provide a worker for you, on demand:
output = some_api(my_worker_service, user_input)
This means your application's design has been preserved; you’ve simply placed your processes on different systems. So, the application now looks something like this:
    user1 =====>                            |===> worker1 => someAPIs
    user2 =====>   App ==> worker_service   |===> worker2 => someAPIs
    user3 =====>                            |===> worker3 => someAPIs
    ...
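Just to illustrate (the URL, path and payload here are placeholders, not an existing API), `some_api` can be as thin as an HTTP call to the worker service, e.g. with `requests`:

```python
import requests


def some_api(worker_service_url: str, user_input: dict, timeout: int = 300) -> dict:
    """Forward a user's request to whichever healthy worker the service routes it to."""
    resp = requests.post(worker_service_url, json=user_input, timeout=timeout)
    resp.raise_for_status()
    return resp.json()


# Inside the main application, instead of subprocess.Popen(...):
my_worker_service = "http://worker-service/run"  # the worker service's DNS name (placeholder)
# output = some_api(my_worker_service, {"user_id": "user1", "preferences": {}})
```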
With this essential component of application redesign in place, let’s revisit our issues from previous designs and see if this helps us and how Kubernetes comes into the picture.
The Proposed Scaling Solution – Enter Kubernetes!
You were absolutely on the right path when you described using a database to maintain the state of our entire system, with the orchestration logic able to retrieve the status of the current containers in the system and make certain decisions. That’s exactly how Kubernetes works!
Let’s see how Kubernetes solves our problems now:

- Processes in Kubernetes can be long-lived. So, requirement (1) is met and limitation (1) of our database design is also mitigated.
- We introduce a Service that will manage all of the worker processes for us. So, requirement (2) is satisfied. It will also be able to scale the processes on demand, so requirement (3) is satisfied. It will also keep a minimum process count of 1 so we don’t spawn unnecessary processes, so requirement (4) is satisfied. It will be intelligent enough to forward traffic only to processes that are healthy, so requirement (5) is met. It will also load-balance traffic across all the processes it governs, so requirement (6) is met. This Service also mitigates limitations (4) and (5) of our second design.
- You will be able to size your processes as needed, to make sure that you only use the resources required. So, requirement (7) is met.
- It uses a central database called etcd, which stores the state of your entire cluster, keeps it updated at all times, and accommodates race conditions as well (multiple components updating the same information – it simply lets the first one to arrive win and fails the other one, forcing it to retry). We’ve solved problem (2) from our second design.
- It comes with logic to orchestrate our processes out of the box, so there is no need to write any code. This mitigates limitation (3) of our second design.
So, not only were we able to meet all of our requirements, we were also able to implement the solution we were trying to achieve, without writing any additional code for the orchestration! (You will just have to restructure your program a little and introduce APIs).
How to Implement This
Just note that in the k8s literature the smallest unit of computation is referred to as a pod, which performs a single function. Architecturally, this is identical to your description of a sub-process. So, whenever I talk about “Pods”, I am simply referring to your sub-processes.
You will take (roughly) the following steps to implement the proposed design.
- Rewrite some part of your application to decouple the application from sub-process.py, introducing an API between them (a minimal sketch of such a worker wrapper follows this list).
- Package sub-process.py into a container image.
- Deploy a small Kubernetes cluster.
- Create a Deployment using your sub-process.py image, and set the min replica count to 1 and the max to any number you want, say 10, for auto-scaling.
- Expose this Deployment by creating a Service. This is the “worker service” I talked about, and your application will “submit” requests to this Service. It will not have to worry about anything other than simply making a request to an API endpoint; everything else is handled by k8s.
- Configure your application to make API calls to this Service.
- Test your application and watch it scale up and down!
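For the first two steps, the container image only needs to expose your existing per-user logic behind a small HTTP endpoint. A minimal sketch with Flask, where `do_work` is just a stand-in for whatever sub-process.py does today:

```python
# worker.py – the entrypoint packaged into the container image used by the Deployment
from flask import Flask, jsonify, request

app = Flask(__name__)


def do_work(user_input: dict) -> dict:
    """Stand-in for the existing sub-process.py logic:
    call the third-party APIs according to the user's preferences."""
    return {"status": "ok", "echo": user_input}


@app.route("/run", methods=["POST"])
def run():
    result = do_work(request.get_json(force=True))
    return jsonify(result)


if __name__ == "__main__":
    # In the real image you would typically run this behind gunicorn or similar.
    app.run(host="0.0.0.0", port=8080)
```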
Now, the way this will function is as follows:
- A client makes a request to your application.
- The application forwards it to the Service API endpoint.
- The Service receives the API request and forwards it to one of the Pods that are running your sub-process.py functionality. If multiple requests are received, the Service balances them across all the pods that are available. If a pod fails, it is taken “away” from the Service by k8s so requests don’t fail.
- The pod performs your functionality and provides the output.
- If all the pods in the Service are reaching saturation, the Deployment auto-scaling will trigger and create more pods for the Service, and load sharing will resume again (scale-out). If the resource utilisation then reduces, the Deployment will remove certain pods that are no longer being used and you will be back to 1 pod (scale-in). See the autoscaler sketch below.
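If you want that scale-out/scale-in behaviour to happen automatically, you attach a HorizontalPodAutoscaler to the worker Deployment. As a sketch using the official `kubernetes` Python client (the Deployment name `worker` and the thresholds are assumptions; the same thing is more commonly written as a YAML manifest or created with `kubectl autoscale`):

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="worker-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="worker"
        ),
        min_replicas=1,                         # scale in to a single pod when idle
        max_replicas=10,                        # upper bound for scale-out
        target_cpu_utilization_percentage=80,   # scale out when pods approach saturation
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```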
If you want, you can put your frontend application into a Deployment and Service as well, which will allow you to have an even friendlier cloud-native microservice architecture. The user will interact with an API of your front-end, which will invoke the Service that is managing your sub-process.py workers, which will return the results.
I hope this helps you and you can appreciate how clearly the micro-service architecture fits into the design pattern you have, and how you can very simply adapt to it and scale your application as much as you want! Not only that, expressing your design this way will also allow you to redesign/manage/test different versions by simply managing a set of YAML manifests (text files) that you can use with Version Control as well!