Shell script that runs a program that takes two input files

I want to write a shell script that calls a program requiring two input files. The catch is that there is not just one pair to process, but an arbitrary number of pairs in the same directory. For example, the directory contains:

1nt.fa 1aa.fa 2nt.fa 2aa.fa 3nt.fa 3aa.fa 4nt.fa 4aa.fa 5nt.fa 5aa.fa

The command line of the program is as follows:

xvfb-run ete3 build -a 1aa.fa -n 1nt.fa -o mix_types -w standard_fasttree --clearall --nt-switch-threshold 0.0

And what I tried was the following, but it didn’t work.

#!/bin/bash
aa='eteanalysis/*.aa.fa'
nt='eteanalysis/*.nt.fa'
for f in eteanalysis/; do
    xvfb-run ete3 build
    -a $aa
    -n $nt
    -w standard_fasttree
    --clearall
    --nt-switch-threshold 0.0
    -o mixed_types/${f%.fasta}.ete3
done

Any ideas? … I have also tried it with parallel, but that did not work either.

Answer

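The command has to be called once per pair, that is, on the intersection of the two sets of files. Since, going by the question, the two file names in a pair differ only in their suffix, the second name can be derived from the first by swapping the suffix, and pairs with a missing partner can be skipped.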
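The suffix swap can be tried on its own before running anything expensive. This is only a sketch: nt_for is a hypothetical helper name, not something from the question. Note that the listing in the question shows names such as 1aa.fa, without a dot before "aa", while the glob *.aa.fa only matches names such as 1.aa.fa. Removing the suffix aa.fa (no leading dot) covers both spellings, and the matching glob would then be *aa.fa.

```bash
#!/bin/bash
# nt_for is a hypothetical helper: it maps an amino-acid file name to the
# nucleotide file name of the same pair by replacing the trailing "aa.fa"
# with "nt.fa".
nt_for() {
    printf '%s\n' "${1%aa.fa}nt.fa"
}

nt_for eteanalysis/1.aa.fa   # prints eteanalysis/1.nt.fa
nt_for eteanalysis/1aa.fa    # prints eteanalysis/1nt.fa
```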

#!/bin/bash
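for aa in eteanalysis/*.aa.fa; do
    # Derive the matching nucleotide file from the amino-acid file name.
    nt=${aa%.aa.fa}.nt.fa
    if [[ ! -e $nt ]]; then
        echo "$nt not found, skipping." >&2
        continue
    fi
    # The trailing backslashes keep all options on one command line.
    xvfb-run ete3 build \
    -a "$aa" \
    -n "$nt" \
    -w standard_fasttree \
    --clearall \
    --nt-switch-threshold 0.0 \
    -o mixed_types/${f%.fasta}.ete3
done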
done

Note, however, that the -o option must still be changed, because f is never set in this loop.
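One possible way to make that change, as a sketch rather than a definitive fix, is to build the output name from the amino-acid file itself. The helper out_for and the naming scheme mixed_types/<pair>.ete3 are assumptions that follow the pattern in the question; any name that is unique per pair would do.

```bash
#!/bin/bash
# out_for is a hypothetical helper: eteanalysis/1.aa.fa -> mixed_types/1.ete3
out_for() {
    local name=${1##*/}          # drop the directory part
    printf 'mixed_types/%s.ete3\n' "${name%.aa.fa}"
}

for aa in eteanalysis/*.aa.fa; do
    [ -e "$aa" ] || continue     # the glob matched nothing
    nt=${aa%.aa.fa}.nt.fa
    if [ ! -e "$nt" ]; then
        echo "$nt not found, skipping." >&2
        continue
    fi
    xvfb-run ete3 build \
        -a "$aa" \
        -n "$nt" \
        -w standard_fasttree \
        --clearall \
        --nt-switch-threshold 0.0 \
        -o "$(out_for "$aa")"
done
```

Each pair then writes to its own output location, so one run does not overwrite the results of another.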
