I searched online but I didn’t find anything that could answer my question.
I’m using a Java tool on Ubuntu Linux, calling it from a Bash command line; the tool takes two paths for two different input files:
# FASTQ is the first read file of the pair, FASTQ2 is the second
java -Xmx8G -jar picard.jar FastqToSam FASTQ=6484_snippet_1.fastq FASTQ2=6484_snippet_2.fastq [...]
What I’d like to do is, for example, instead of specifying the path of a single file for FASTQ, specify the paths of two different files.
So instead of running cat file1 file2 > File and using File as the input for FASTQ, I’d like this operation to be executed on the fly, without saving File on the file system (which is what happens with the command cat file1 file2 > File).
I hope I’ve explained my question clearly; if not, just ask and I’ll try to explain it better.
Answer
Most well-written shell commands which accept a file name argument also usually accept a list of file name arguments, like cat file or cat file1 file2 etc.
If the program you are trying to use doesn’t support this, and cannot easily be fixed, perhaps your OS or shell makes /dev/stdin available as a pseudo-file.
cat file1 file2 | java -mumble -crash -burn FASTQ=/dev/stdin
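To try the /dev/stdin approach without the real tool, here is a minimal sketch. The count_lines function is a hypothetical stand-in for a program that only accepts a single FASTQ=path argument, and the file names and contents are made up for illustration. It assumes the OS provides /dev/stdin (Linux does) and that the program reads its input sequentially; a program that needs to seek in the file or reopen it will probably not work with a pipe.

```shell
#!/bin/sh
# Hypothetical stand-in for a tool that only accepts one FASTQ=path argument.
count_lines() {
    path=${1#FASTQ=}                 # strip the FASTQ= prefix to get the path
    wc -l < "$path" | tr -d ' '      # read the path like an ordinary file
}

tmp=$(mktemp -d) || exit
printf 'a\nb\n' > "$tmp/file1"       # made-up sample inputs
printf 'c\n' > "$tmp/file2"

# The concatenation travels through the pipe; no combined file is written.
result=$(cat "$tmp/file1" "$tmp/file2" | count_lines FASTQ=/dev/stdin)

rm -rf "$tmp"
echo "$result"
```

Note that this only covers one input: there is a single standard input, so it cannot supply both FASTQ and FASTQ2 at the same time.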
Some shells also have process substitutions, which (typically) look to the calling program like a single file containing whatever the process substitution produces on standard output.
java -mumble -crash -burn FASTQ=<(cat file1 file2) FASTQ2=<(cat file3 file4)
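A small sketch of what the called program sees with process substitution, assuming Bash (Zsh and ksh93 have it too, plain sh often does not). The show_inputs function is a hypothetical stand-in for the Java tool, and the file names and contents are invented for the demonstration. Each <(...) is replaced by a path, typically something like /dev/fd/63, which the program opens like a regular file.

```shell
#!/usr/bin/env bash
# Requires a shell with process substitution (Bash, Zsh, ksh93).

# Hypothetical stand-in for a tool taking FASTQ=path and FASTQ2=path.
show_inputs() {
    local arg
    for arg in "$@"; do
        # Print NAME=, then the lines of the file at that path joined by commas.
        printf '%s=%s\n' "${arg%%=*}" "$(paste -sd, "${arg#*=}")"
    done
}

tmp=$(mktemp -d) || exit
trap 'rm -rf "$tmp"' EXIT
printf 'r1a\n' > "$tmp/file1"        # made-up sample inputs
printf 'r1b\n' > "$tmp/file2"
printf 'r2a\n' > "$tmp/file3"
printf 'r2b\n' > "$tmp/file4"

# Each <(...) expands to a path that yields the concatenated data on the fly.
result=$(show_inputs FASTQ=<(cat "$tmp/file1" "$tmp/file2") \
                     FASTQ2=<(cat "$tmp/file3" "$tmp/file4"))
echo "$result"
```

As with /dev/stdin, the data arrives through a pipe, so this is only likely to work if the program reads each input once from start to end.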
If neither of these works, a simple shell script which uses temporary files and deletes them when it’s done is a tried and true solution.
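#!/bin/sh
# Require four file name arguments; they are processed pairwise.
: "${4?Need four file name arguments, will process them pairwise}"
t=$(mktemp -d -t fastqtwoness.XXXXXXX) || exit
# Remove the temporary directory in case of failure or when done.
trap 'rm -rf "$t"' EXIT HUP INT TERM
cat "$1" "$2" >"$t/1.fastq"
cat "$3" "$4" >"$t/2.fastq"
# Run java without exec: exec would replace the shell, so the EXIT trap would never fire.
java -mumble -crash -burn FASTQ="$t/1.fastq" FASTQ2="$t/2.fastq"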