I am using Docker for Windows with Linux containers. I have created a simple Python script that reads two text files, appends one to the other, and exports the result to a file. Below is the code for test_script.py:
    # including libraries
    import pandas as pd
    from os import path

    # setting path to data
    path2data1 = './data1'
    path2data2 = './data2'
    path2output = './'

    # reading input files
    input_data1 = pd.read_table(path.join(path2data1, "sample_data_input1.txt"))
    input_data2 = pd.read_table(path.join(path2data2, "sample_data_input2.txt"))

    # adding both the data
    combined_data = input_data1.append(input_data2, ignore_index=True)

    # print data in an output file
    combined_data.to_csv(path.join(path2output, 'outputdata.csv'), header=True, index=False, encoding='utf-8')
Now I am trying to create a Docker container for this. I want to pass only the folder location, because the data changes every day. I also want to get the output file back after running the Docker image.
I wrote the following Dockerfile
    # Use an official Python runtime as a parent image
    FROM python:3

    ENV http_proxy http://proxy-chain.xxx.com:911/
    ENV https_proxy http://proxy-chain.xxx.com:912/

    COPY . /app
    WORKDIR /app/

    # Install any needed packages specified
    RUN pip install pandas

    # Run test_script.py when the container launches
    CMD ["python", "test_script.py"]
I build the Docker image with:
    docker build -t test_build .
It builds successfully without any errors.
I run the image with:
    docker run --volume ./test_script.py:/test_script.py test_build > ./output.txt
This creates the output file, but the file is empty.
How can I get the output file with the data in it?
Answer
@archit you need to attach a volume to your container.
A volume is how the container gets its input files each time you run it, and it is also how the output file persists on the host after the container exits.
docker run -v host_volume:/app test_build
Put the input files you want the container to use in that volume, not your script; the script was already copied into the image when you built it.
I suggest one of two things:
- Change your code to pick up the most recently updated input file in the volume directory and process it, so you don't need to pass any parameters each time you run it.
- Change your Dockerfile from CMD to ENTRYPOINT.
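For the first option, a minimal sketch of how the script might find the newest input file could look like the following. The function name `newest_input_file` and the `*.txt` pattern are assumptions for illustration, and "most updated" is interpreted here as the latest modification time.

```python
from pathlib import Path


def newest_input_file(data_dir, pattern="*.txt"):
    """Return the most recently modified file in data_dir that matches pattern.

    Hypothetical helper: data_dir would be the directory the volume is
    mounted at inside the container.
    """
    candidates = [p for p in Path(data_dir).glob(pattern) if p.is_file()]
    if not candidates:
        raise FileNotFoundError(f"no files matching {pattern!r} in {data_dir}")
    # Compare files by modification time and keep the latest one
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The script would then pass the returned path to `pd.read_table` instead of a hard-coded file name. The second option avoids this lookup by passing the paths explicitly.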
Then when you run it you can do this:
docker run -it -v path_in_your_comp:path_inside_your_docker test_build path_inside_your_docker/input_file_name path_inside_your_docker/output_file_name
Your Python script needs to read these parameters from its command-line arguments when it starts. Keep in mind that the paths must use the directory you mapped the volume to inside the container.
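As a rough sketch of reading those arguments, the script could use the standard-library `argparse` module. The argument names `input_path` and `output_path` and the `/data/...` paths are assumptions for illustration, not something the original script defines.

```python
import argparse


def parse_args(argv=None):
    """Parse the input and output paths given after the image name."""
    parser = argparse.ArgumentParser(
        description="Read an input file and write an output file."
    )
    # Both paths are as seen inside the container, under the mounted volume
    parser.add_argument("input_path", help="input file inside the container")
    parser.add_argument("output_path", help="output file inside the container")
    return parser.parse_args(argv)


# Explicit list for demonstration only; the paths are hypothetical
args = parse_args(["/data/sample_data_input1.txt", "/data/outputdata.csv"])
print(f"reading {args.input_path}, writing {args.output_path}")
```

In the real script you would call `parse_args()` with no argument so that it reads `sys.argv`, which is where Docker places whatever follows the image name when the Dockerfile uses the exec form of ENTRYPOINT.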