I tried doing
import subprocess p = subprocess.Popen("ls -la /etc", stdout=subprocess.PIPE, stderr=subprocess.PIPE) p.stdout.read().decode()
Which gives me
FileNotFoundError: [Errno 2] No such file or directory: 'ls -la /etc': 'ls -la /etc'
Following "Python subprocess.Popen with var/args", I did
import subprocess p = subprocess.Popen(["ls", "-la", "/etc"], stdout=subprocess.PIPE, stderr=subprocess.PIPE) p.stdout.read().decode()
Which did work.
Why is that? Why do I have to split my command and its arguments? What’s the rationale behind this design?
Python version:
3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
Answer
That’s how all process invocations work on UNIX.
Under the hood, running a program on UNIX is traditionally done with the following steps (sketched in Python right after this list):
- fork() off a child process.
- In that child process, open new copies of stdin, stdout, stderr, etc. if redirections are requested, using the dup2() call to assign the newly-opened files over the file descriptors that are the redirection targets.
- In that child process, use the execve() syscall to replace the current process with the desired child process. This syscall takes an array of arguments, not a single string.
- wait() for the child to exit, if the call is meant to be blocking.
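Here is a minimal sketch of those steps using Python's os module directly. The output file name listing.txt and the path /bin/ls are illustrative assumptions; real code would normally let subprocess do all of this for you.

import os

pid = os.fork()                       # 1. fork() off a child process
if pid == 0:
    # 2. In the child, redirect stdout to a file using dup2()
    fd = os.open("listing.txt", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    os.dup2(fd, 1)                    # file descriptor 1 is stdout
    os.close(fd)
    # 3. execve() replaces this process; note the *array* of arguments
    os.execve("/bin/ls", ["ls", "-la", "/etc"], dict(os.environ))
else:
    os.waitpid(pid, 0)                # 4. wait() for the child to exit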
So, subprocess.Popen exposes the array interface, because an array of arguments is what the operating system actually works with under the hood.
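If you really do want to pass a single string, subprocess can be told to hand it to a shell explicitly; in that case /bin/sh, not Python, splits the string into an array before execve() runs. The command below is just the one from the question, reused as an illustration:

import subprocess

# shell=True passes the string to /bin/sh -c, which does the splitting
# (and globbing, and variable expansion) before execve() is called.
p = subprocess.Popen("ls -la /etc", shell=True,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
print(p.stdout.read().decode())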
When you run ls /tmp at a shell, that shell transforms the string into an array and then performs the above steps itself. Doing the transformation yourself gives you more control and avoids serious bugs: if someone creates a file named /tmp/$(rm -rf ~), you don’t want an attempt to cat /tmp/$(rm -rf ~) to end up deleting your home directory.
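As a rough illustration of that transformation, Python's shlex module splits a string the way a POSIX shell would, and can quote untrusted pieces when you have no choice but to build a shell string. The hostile filename below is the one from the example above:

import shlex

# The split a shell would perform on the command string:
print(shlex.split("ls -la /etc"))        # ['ls', '-la', '/etc']

# Quoting an untrusted name so a shell treats it as literal text
# instead of running the command substitution buried inside it:
dangerous = "/tmp/$(rm -rf ~)"
print("cat " + shlex.quote(dangerous))   # cat '/tmp/$(rm -rf ~)'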