I’m trying to write a Python script that runs a bash script on a remote machine via SSH and then parses its output. The bash script writes a lot of data (about 5 MB of text, ~50k lines) to stdout, and here is the problem: I get all of the data in only ~10% of cases. In the other 90% of cases I get about 97% of what I expect, and it always seems to be trimmed at the end. This is what my script looks like:
```python
import subprocess
import re
import sys
import paramiko


def run_ssh_command(ip, port, username, password, command):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(ip, port, username, password)
    stdin, stdout, stderr = ssh.exec_command(command)
    output = ''
    while not stdout.channel.exit_status_ready():
        solo_line = ''
        # Print stdout data when available
        if stdout.channel.recv_ready():
            # Retrieve up to 2048 bytes
            solo_line = stdout.channel.recv(2048)
        output += solo_line
    ssh.close()
    return output


result = run_ssh_command(server_ip, server_port, login, password, 'cat /var/log/somefile')
print "result size: ", len(result)
```
I’m pretty sure the problem is that some internal buffer is overflowing, but which one, and how do I fix it?
Thank you very much for any tip!
Answer
When `stdout.channel.exit_status_ready()` starts returning `True`, there may still be a lot of data on the remote side, waiting to be sent. But you only receive one more chunk of 2048 bytes and quit.
Instead of checking the exit status, you could keep calling `recv(2048)` until it returns an empty string, which means that no more data is coming:
```python
output = ''
next_chunk = True
while next_chunk:
    next_chunk = stdout.channel.recv(2048)
    output += next_chunk
```
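To see why the read-until-empty pattern collects everything, here is a minimal local sketch of the same loop using `socket.socketpair()` as a stand-in for the SSH channel (an assumption for illustration; it is not the paramiko transport). Once the sender closes its end, `recv()` returns an empty string, which cleanly ends the loop:

```python
import socket

# Local stand-in for the remote channel: one end sends, the other reads.
a, b = socket.socketpair()
a.sendall(b"x" * 20000)
a.close()  # closing signals EOF, so recv() will eventually return b""

output = b""
while True:
    chunk = b.recv(2048)
    if not chunk:   # empty result means the sender is done
        break
    output += chunk
b.close()
```

All 20000 bytes arrive even though each `recv()` call returns at most 2048, because the loop keeps draining the buffer until EOF instead of stopping at an arbitrary status check.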
But really you probably just want:
```python
output = stdout.read()
```
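The same principle can be demonstrated locally with `subprocess` (a hypothetical stand-in for the SSH channel): a single blocking `read()` drains the child's entire output to EOF, so nothing is trimmed even for multi-megabyte streams:

```python
import subprocess
import sys

# Child process emits ~5 MB on stdout (similar in size to the log in the question).
proc = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.write('x' * 5000000)"],
    stdout=subprocess.PIPE,
)
data = proc.stdout.read()  # blocks until EOF, returns everything
proc.wait()
```

`read()` with no size argument loops internally until the stream is closed, which is exactly the behavior the manual `recv()` loop reimplements by hand.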