
Not able to get output file with data while creating a docker image from a Python script using Docker for windows

I am using Docker for Windows with Linux containers. I have created a simple Python script that reads two text files, appends one to the other, and exports the result to a text file. Below is the code for test_script.py:

#including libraries
import pandas as pd
from os import path

#setting path to data
path2data1 = './data1'
path2data2 = './data2'
path2output = './'

#reading input file
input_data1 = pd.read_table(path.join(path2data1,"sample_data_input1.txt"))
input_data2 = pd.read_table(path.join(path2data2,"sample_data_input2.txt"))

#combining both data sets (DataFrame.append was removed in pandas 2.0)
combined_data = pd.concat([input_data1, input_data2], ignore_index=True)

#write data to an output file
combined_data.to_csv(path.join(path2output, 'outputdata.csv'), 
                   header=True, index=False, encoding='utf-8')

Now I am trying to create a Docker container from this. I want to pass only the folder location, because the data changes every day. I also want to get the output file after running the Docker image.

I wrote the following Dockerfile:

# Use an official Python runtime as a parent image
FROM python:3
ENV http_proxy http://proxy-chain.xxx.com:911/
ENV https_proxy http://proxy-chain.xxx.com:912/


COPY . /app
WORKDIR /app/

# Install any needed packages specified
RUN pip install pandas

# Run test_script.py when the container launches
CMD ["python", "test_script.py"] 

I build the image with the command docker build -t test_build . and it completes without any errors.

I run the image with docker run --volume ./test_script.py:/test_script.py test_build > ./output.txt. This creates output.txt, but the file is empty.

How can I get the output file with the data in it?


Answer

@archit You need to attach a volume to your container.
A volume is the simplest way to persist your output file, and it is also how the container gets the input files each time you run it.

docker run -v host_volume:/app test_build

Put the input files you want the container to use in that volume, not your script; the script was already added to the image when you built it.
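The redirect in the question (> ./output.txt) only captures what the script prints to stdout, and the script prints nothing; outputdata.csv is written inside the container's filesystem and is lost when the container exits. A minimal sketch of the script adjusted to read from and write to a mounted directory might look like the following. The /data mount point and the DATA_DIR environment variable are assumptions made for illustration, not part of the original setup. A separate mount point is used because a bind mount over /app would hide the script that COPY placed there.

```python
import os
from pathlib import Path


def build_paths(data_dir):
    """Return the two input paths and the output path under data_dir."""
    root = Path(data_dir)
    inputs = [
        root / "data1" / "sample_data_input1.txt",
        root / "data2" / "sample_data_input2.txt",
    ]
    return inputs, root / "outputdata.csv"


def combine(data_dir):
    """Append the two input tables and write the result into data_dir."""
    # Imported here so the path helper above also works where pandas
    # is not installed.
    import pandas as pd

    inputs, output = build_paths(data_dir)
    frames = [pd.read_table(p) for p in inputs]
    combined = pd.concat(frames, ignore_index=True)
    combined.to_csv(output, header=True, index=False, encoding="utf-8")
    return output


if __name__ == "__main__":
    # DATA_DIR is a hypothetical variable; /data is an assumed mount point.
    print(combine(os.environ.get("DATA_DIR", "/data")))
```

With this layout, running docker run -v path_in_your_comp:/data test_build should leave outputdata.csv in the host folder, provided that folder contains the data1 and data2 subfolders.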

I suggest one of two things:

  1. Change your code to pick up the most recent input file in the volume directory and process it; that way you don’t need to pass any parameters each time you run it.
  2. Change CMD to ENTRYPOINT in your Dockerfile.
    Then you can run it like this:
    docker run -it -v path_in_your_comp:path_inside_your_docker test_build path_inside_your_docker/input_file_name path_inside_your_docker/output_file_name
    Your Python script must be able to read these parameters from its command-line arguments at startup. Keep in mind that the paths use the directory you mapped the volume to inside the container.
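Both suggestions can be sketched in a few lines of Python. With ENTRYPOINT ["python", "test_script.py"] in the Dockerfile, anything placed after the image name in docker run is appended to that command and reaches the script as command-line arguments. The helper names below (newest_file, parse_args, resolve_input) and the *.txt pattern are hypothetical, chosen only for this illustration.

```python
import argparse
from pathlib import Path


def newest_file(directory, pattern="*.txt"):
    """Return the most recently modified file matching pattern, or None."""
    candidates = [p for p in Path(directory).glob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def parse_args(argv=None):
    """Read the input and output paths passed after the image name."""
    parser = argparse.ArgumentParser(description="Combine input files.")
    parser.add_argument("input_path", help="input file, or a directory to search")
    parser.add_argument("output_path", help="where to write the result")
    return parser.parse_args(argv)


def resolve_input(input_path):
    """If given a directory, fall back to its newest .txt file (option 1)."""
    path = Path(input_path)
    if path.is_dir():
        found = newest_file(path)
        if found is None:
            raise FileNotFoundError(f"no .txt files in {path}")
        return found
    return path


if __name__ == "__main__":
    args = parse_args()
    print(resolve_input(args.input_path), "->", args.output_path)
```

Passing a directory instead of a file name combines the two options: the script then takes whichever .txt file in the mounted folder was modified last.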
User contributions licensed under: CC BY-SA