I am using the Tracer software package (https://github.com/Teichlab/tracer). The program is invoked as follows:
tracer assemble [options] <file_1> [<file_2>] <cell_name> <output_directory>
The program runs on a single dataset and the output goes to /<output_directory>/<cell_name>
What I want to do now is run this program on multiple files. This is what I currently do:
for filename in /home/tobias/tracer/datasets/test/*.fastq
do
    echo "Processing $filename file..."
    python tracer assemble --single_end --fragment_length 62 --fragment_sd 1 $filename Tcell_test output
done
This works in principle, but as cell_name is static, every iteration overwrites the output from the previous iteration. How do I need to change my script in order to give the output folder the name of the input file?
For example: the input filename is tcell1.fastq. For this file, cell_name should be tcell1. The next file is tcell2.fastq and cell_name should be tcell2, and so on…
Answer
I think this will do it, in bash, if I understand correctly:
for filename in /home/tobias/tracer/datasets/test/*.fastq
do
    echo "Processing $filename file..."
    # File name without the leading directory part
    basefilename="${filename##*/}"
    # Pass the file name without its .fastq suffix as cell_name
    python tracer assemble --single_end --fragment_length 62 --fragment_sd 1 "$filename" "${basefilename%.fastq}" output
done
${filename##*/} removes everything up to and including the last /, and ${basefilename%.fastq} removes the .fastq at the end.
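To see the two expansions in isolation, here is a minimal sketch that can be run without Tracer installed. The helper name cell_name_from_path is hypothetical (it is not part of Tracer), and the paths are only the example file names from the question; nothing needs to exist on disk because the expansions work on the strings alone.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Tracer): derive a cell name from a FASTQ path.
cell_name_from_path() {
    local path="$1"
    # ## removes the longest prefix matching */ , i.e. the directory part
    local base="${path##*/}"
    # % removes the shortest suffix matching .fastq
    printf '%s\n' "${base%.fastq}"
}

# Example paths taken from the question; the files do not have to exist.
for path in /home/tobias/tracer/datasets/test/tcell1.fastq \
            /home/tobias/tracer/datasets/test/tcell2.fastq; do
    echo "$path -> $(cell_name_from_path "$path")"
done
```

If the input has no .fastq suffix, the second expansion leaves the name unchanged. The external command basename "$path" .fastq should give the same result for these paths, at the cost of starting an extra process per file.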