I have a large number of Linux servers to maintain. I frequently need to run a script (script.sh) on all of them to get their health status; the script usually takes about 30-40 seconds to produce output. To make maintenance easier, I'm writing a shell script that uses SSH to loop through all remote hosts, run script.sh, collect the output, and write it to a log file on my local host. For the sake of this question, I have named this script MyScript.sh.
The script works fine; however, it has to wait for the SSH output before continuing to the next host. Because I have so many servers and the commands run in sequence, it takes several minutes to finish. I would like to loop through all servers in parallel, without needing to wait for a response from each host.
Is there a way I can remotely run script.sh simultaneously on all hosts using MyScript.sh? Maybe run the ssh command in the background and somehow collect the output?
The output of script.sh is a single line of pipe-separated fields, such as the following:
host1|49 days|10%|3.77%|27677/63997 MB|43% - /usr|38% - /usr|Optimal|No|40%|No
The output of MyScript.sh is the concatenation of the output from all hosts, without the pipes:
Date Hostname Uptime CPU I/O Free MEM File System INODES STATUS WWW YYY ZZZ XXX
===================================================================================================================================================================================================
01/31/20 host1 44 days 5% 10.33% 38083/64000 MB 57% - / 37% - /usr OPTIMAL No 40% No
01/31/20 host2 45 days 11% 1.79% 27915/63997 MB 43% - /usr 38% - /usr OPTIMAL UP 7% OK
01/31/20 host3 45 days 2% 1.89% 32145/63997 MB 43% - /usr 38% - /usr OPTIMAL UP NO OK
01/31/20 host4 45 days 11% 3.72% 52477/128637 MB 49% - /var 38% - /usr OPTIMAL UP 8% OK
01/31/20 host5 45 days 6% 3.21% 65264/128637 MB 46% - /var 38% - /usr OPTIMAL UP NO OK
01/31/20 host6 45 days 7% 5.79% 56369/63997 MB 43% - /usr 38% - /usr OPTIMAL UP NO No
01/31/20 host7 45 days 6% 1.66% 56391/63997 MB 43% - /var 38% - /usr OPTIMAL UP NO No
The core of MyScript.sh is the following:
(
for ip in $IP_LIST; do
    echo "Checking $ip"
    ssh -q -t $user@$ip 'sudo /tmp/script.sh' > /tmp/$$
    current_date=$(date +%D)
    printf "%-10s " "$current_date" >> $logfile
    while read line; do
        echo $line | awk -F '|' '{printf("%-10s %-10s %-7s %-8s %-18s %-25s %-25s %-15s %-15s %-25s %-10s\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11); }' >> $logfile
    done < /tmp/$$
done
)
In summary, I would like to optimize this script to run the above code simultaneously on multiple servers. Thanks!
Answer
One solution could be to deploy monitoring software with custom checks.
For the parallel ssh problem, without installing any binaries, you could use this script I wrote a while ago.
Put it in a file named mssh, run chmod u+x mssh, and then:
./mssh -s SERVER1 -s SERVER2 -C script.sh
The mssh file:
#!/usr/bin/env bash

readonly prog_name="$(basename "$0")"
readonly date="$(date +%Y%m%d_%H%M%S)"

# print help
usage() {
cat <<- EOF
usage: $prog_name options

parallel ssh executions.

OPTIONS:
  -c --cmd CMD               execute command CMD
  -s --host SRV              execute cmd on server SRV
  -C --cmd-file CMD_FILE     execute command contained in CMD_FILE
  -S --hosts-file SRV_FILE   execute cmd on all servers contained in SRV_FILE
  -h --help                  show this help

Examples:
  Run CMD on SERVER1 and SERVER2:
  ./$prog_name -s SERVER1 -s SERVER2 -c "CMD"
EOF
}

# test if an element is in an array
is_element() {
    local search=$1; shift
    for e in "$@"; do
        [[ "$e" == "$search" ]] && return 0
    done
    return 1
}

# parse arguments: map long options to their short equivalents
for arg in "$@"; do
    case "$arg" in
        --help)       args+=( -h );;
        --host)       args+=( -s );;
        --hosts-file) args+=( -S );;
        --cmd)        args+=( -c );;
        --cmd-file)   args+=( -C );;
        *)            args+=("$arg");;
    esac
done
set -- "${args[@]}"

while getopts "hs:S:c:C:" OPTION; do
    case $OPTION in
        h) usage; exit 0;;
        s) servers_array+=("$OPTARG");;
        # read servers from a file, skipping blank lines and comments
        S) while read -r L; do servers_array+=("$L"); done < <( grep -vE "^ *(#|$)" "$OPTARG");;
        c) cmd="$OPTARG";;
        C) cmd="$(< "$OPTARG")"; file=$OPTARG;;
        *) :;;
    esac
done

if [[ -z ${servers_array[0]} ]] || [[ -z $cmd ]]; then
    usage; exit 1
fi

# clean up created files at exit
trap "rm -f /tmp/pssh*$date" EXIT

[[ -n $file ]] && echo "executing command file : $file" || echo "executing command : $cmd"

# run cmd on each server
for i in "${!servers_array[@]}"; do
    # executing cmd in subshell
    ssh -n "${servers_array[$i]}" "$cmd" > "/tmp/pssh_${i}_${servers_array[$i]}_${date}" 2>&1 &
    pid=$!
    pids_array+=("$pid")
    echo "${servers_array[$i]} - $pid"
done

# for each pid, set state to running
ps_state_array=( $(for i in "${!servers_array[@]}"; do echo "running"; done) )

echo "waiting for results..."
echo

# begin finished verifications
continue=true; attempt=0
while $continue; do
    # foreach ps
    for i in "${!pids_array[@]}"; do
        # if already finished skip
        [[ ${ps_state_array[$i]} == "finished" ]] && continue
        # else check if finished
        ps -o pid "${pids_array[$i]}" > /dev/null 2>&1 && ps_finished=false || ps_finished=true
        if $ps_finished; then
            ps_state_array[$i]="finished"
            echo -e "[ ${servers_array[$i]} @ $(date +%H:%M:%S) ]" | grep '.*' --color=always
            cat "/tmp/pssh_${i}_${servers_array[$i]}_${date}"
            rm -f "/tmp/pssh_${i}_${servers_array[$i]}_${date}"
            echo
        fi
    done
    is_element "running" "${ps_state_array[@]}" || continue=false
    if $continue; then
        # back off: sleep 1s, 2s, ... up to 5s between polls
        (( attempt < 5 )) && attempt=$(( attempt + 1 ))
        sleep $attempt
    fi
done

exit 0
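If you would rather keep everything inside MyScript.sh, the same idea (start every ssh in the background, write each host's output to its own temporary file, then collect) can be sketched in a few lines. This is only a rough sketch, not a drop-in replacement: the function names remote_run and run_parallel are made up for illustration, the BatchMode and ConnectTimeout options are my own additions, and it assumes key-based login plus sudo that works without a password or a TTY (the -t from the question is dropped because a background job has no terminal to offer).

```shell
#!/usr/bin/env bash
# Sketch only: remote_run and run_parallel are hypothetical helper names.

# Run one command on one host. Kept separate so it can be swapped out.
# Assumes key-based login; BatchMode makes ssh fail instead of prompting.
remote_run() {
    ssh -n -q -o BatchMode=yes -o ConnectTimeout=10 "$1" "$2"
}

# usage: run_parallel CMD HOST...
# Starts CMD on every host at once, then prints the results in host order.
run_parallel() {
    local cmd=$1; shift
    local tmpdir host rc i=0
    local -a pids=()
    tmpdir=$(mktemp -d) || return 1

    # start all jobs in the background, one output file per host
    for host in "$@"; do
        remote_run "$host" "$cmd" > "$tmpdir/$i.out" 2> "$tmpdir/$i.err" &
        pids+=("$!")
        i=$((i + 1))
    done

    # collect: wait for each job in turn and print its output
    i=0
    for host in "$@"; do
        wait "${pids[$i]}"
        rc=$?
        if [ "$rc" -eq 0 ]; then
            cat "$tmpdir/$i.out"
        else
            # keep one row per host even when ssh or the remote command fails
            echo "$host|FAILED (exit $rc)"
        fi
        i=$((i + 1))
    done

    rm -rf "$tmpdir"
}
```

In the question's script this would be called roughly as run_parallel 'sudo /tmp/script.sh' followed by the user@host entries, with the result piped through the existing awk formatting into $logfile. Because all jobs start before the first wait, the total time is about that of the slowest host rather than the sum of all hosts, and the rows still come out in the order of the host list.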