I’m trying to write a Python script that runs a bash script on a remote machine via SSH and then parses its output. The bash script writes a lot of data (about 5 megabytes of text, 50k lines) to stdout, and here is the problem: I get all the data in only ~10% of cases. In the other 90% I get about 97% of what I expect, and it always seems to be cut off at the end. This is what my script looks like:
import subprocess
import re
import sys
import paramiko

def run_ssh_command(ip, port, username, password, command):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(ip, port, username, password)
    stdin, stdout, stderr = ssh.exec_command(command)
    output = ''
    while not stdout.channel.exit_status_ready():
        solo_line = ''
        # Collect stdout data when available
        if stdout.channel.recv_ready():
            # Retrieve up to 2048 bytes
            solo_line = stdout.channel.recv(2048)
            output += solo_line
    ssh.close()
    return output

result = run_ssh_command(server_ip, server_port, login, password, 'cat /var/log/somefile')
print "result size: ", len(result)
I’m pretty sure the problem is some internal buffer overflowing, but which one, and how do I fix it?
Thank you very much for any tip!
Answer
When stdout.channel.exit_status_ready() starts returning True, there might still be a lot of data on the remote side, waiting to be sent. But your loop reads at most one more chunk of 2048 bytes and then quits, so the rest is lost.
Instead of checking the exit status, you could keep calling recv(2048) until it returns an empty string, which means that no more data is coming:
output = ''
next_chunk = True
while next_chunk:
    next_chunk = stdout.channel.recv(2048)
    output += next_chunk
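Here is a slightly fuller sketch of the same loop, written for Python 3, where recv returns bytes rather than str. The names read_all and FakeChannel are made up for illustration; FakeChannel only stands in for a real channel so the snippet runs without a server. With paramiko you would pass stdout.channel instead.

```python
def read_all(channel, chunk_size=2048):
    """Collect everything the channel sends until it reports end of stream."""
    chunks = []
    while True:
        # With a real channel this blocks until data arrives or the stream ends.
        chunk = channel.recv(chunk_size)
        if not chunk:  # empty result: no more data is coming
            break
        chunks.append(chunk)
    return b"".join(chunks)


class FakeChannel:
    """Hypothetical stand-in for an SSH channel, used only for this demo."""

    def __init__(self, data):
        self._data = data

    def recv(self, nbytes):
        # Hand out the next slice of at most nbytes, like a socket would.
        chunk, self._data = self._data[:nbytes], self._data[nbytes:]
        return chunk


payload = b"x" * 5000
result = read_all(FakeChannel(payload))
print("result size:", len(result))
```

Collecting the chunks in a list and joining once at the end avoids building a new, longer string on every iteration, which can matter with several megabytes of output.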
But really you probably just want:
output = stdout.read()
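For completeness, this is roughly how the whole helper might look with read(). It is only a sketch: run_command is a made-up name, it assumes you pass in an already connected client such as a paramiko.SSHClient, and the _Fake* classes exist only so the snippet runs without a real server.

```python
import io


def run_command(client, command):
    """Run command on a connected SSH client; return (exit_status, stdout_bytes)."""
    stdin, stdout, stderr = client.exec_command(command)
    output = stdout.read()  # reads until the remote side closes stdout
    status = stdout.channel.recv_exit_status()  # waits for the command to finish
    return status, output


class _FakeChannel:
    """Hypothetical stand-in for the channel behind stdout."""

    def recv_exit_status(self):
        return 0


class _FakeStream(io.BytesIO):
    """Hypothetical stand-in for the file-like objects exec_command returns."""

    channel = _FakeChannel()


class _FakeClient:
    """Hypothetical stand-in for a connected SSH client."""

    def exec_command(self, command):
        return _FakeStream(), _FakeStream(b"line\n" * 50000), _FakeStream()


status, output = run_command(_FakeClient(), "cat /var/log/somefile")
print("exit status:", status, "result size:", len(output))
```

With a real connection you would create the client the same way as in the question (SSHClient, set_missing_host_key_policy, connect) and close it after run_command returns.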