Adding the directory name as a prefix to all rows of a column in Bash

I want to prefix every row in the first column of a CSV file with the name of the folder that contains the file. The aim is to combine the awk command below with find so that I can automate this and apply it to all directories and subdirectories within a folder. To be safe, I am trying to write the result to a new file with the suffix _prefix.csv.

find . -name "*.fasta"  -exec bash -c '
     for file do        
     prefix="${PWD##*/}" 
     awk -v a="$prefix" {if(NR==1){print; next}; $1="$a"_$1; print} %P >> %P_prefix.csv"
     done' _ {}

What I have:

27S_544
 - contigs.fasta
             ID | Rds 
         864585 | XX 

 - scaffolds.fasta
             ID | Rds 
         845335 | XX  


28S_545
  - contigs.fasta
             ID | Rds 
         867685 | XX  
  - scaffolds.fasta
             ID | Rds 
         867634 | XX 

Desired output:

 27S_544
     - contigs.fasta
                  ID | Rds 
      27S_544_864585 | XX  

     - scaffolds.fasta
                  ID | Rds 
      27S_544_845335 | XX  


  28S_545
      - contigs.fasta
                   ID | Rds 
       28S_545_867685 | XX  

      - scaffolds.fasta
                   ID | Rds 
       28S_545_867634 | XX 

Error

find: missing argument to `-exec

Answer

Instead of dealing with complicated quotes, consider reading the stream line by line:

find . -name "*.fasta" |
while IFS= read -r file; do
     prefix="${PWD##*/}"
     # awk IS NOT bash
     #  in bash: "${a}_" to concatenate variable a with a _
     #  in awk:  a "_" 
     awk -v a="$prefix" '{if(NR==1){print; next}; $1 = a "_" $1; print}' "$file" >> "${file}_prefix.csv"
     # or just:
     # awk -v a="$prefix" 'NR!=1{$1=a"_"$1}1' "$file" >> "${file}_prefix.csv"
done
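
One caveat with the loop above: ${PWD##*/} expands to the name of the directory the command was started from, so every file would get the same prefix. Below is a minimal sketch of one possible way to take the prefix from each file's own parent directory instead. The helper name add_dir_prefix, the comma separator, and the sample data are assumptions for illustration, not part of the original answer:

```shell
# Hypothetical helper: prefix column 1 of every data row with the name of
# the directory that contains the file. Assumes comma-separated input.
add_dir_prefix() {
    local file="$1"
    local dir prefix
    dir=$(dirname -- "$file")       # e.g. /tmp/x/27S_544
    prefix=$(basename -- "$dir")    # e.g. 27S_544
    # -F, and OFS=, are assumptions; adjust both to the real delimiter.
    awk -F, -v OFS=, -v a="$prefix" 'NR != 1 { $1 = a "_" $1 } 1' "$file" > "${file}_prefix.csv"
}

# Demo on throwaway data so the sketch can be run safely.
demo=$(mktemp -d)
mkdir -p "$demo/27S_544" "$demo/28S_545"
printf 'ID,Rds\n864585,XX\n' > "$demo/27S_544/contigs.fasta"
printf 'ID,Rds\n867685,XX\n' > "$demo/28S_545/contigs.fasta"

find "$demo" -name '*.fasta' |
while IFS= read -r file; do
    add_dir_prefix "$file"
done

cat "$demo/27S_544/contigs.fasta_prefix.csv"
```

This sketch writes with > rather than >>, so rerunning it overwrites the output instead of appending duplicate rows.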

Once you get the hang of it, re-escape the script for a subshell when needed:

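find . -name "*.fasta" -exec bash -c '
     prefix="${PWD##*/}"
     # the sequence after a="$prefix" closes the outer quote, adds a literal single quote, and reopens it
     awk -v a="$prefix" '\''{if(NR==1){print; next}; $1 = a "_" $1; print}'\'' "$1" >> "${1}_prefix.csv"
 ' _ {} \;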
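
find reports "missing argument to -exec" when the -exec clause has no terminator: it must end either with an escaped ; (one invocation per file) or with + (many files per invocation). Below is a rough sketch of the batched form, which matches the "for file do" loop from the question. It assumes comma-separated data and takes the prefix from each file's parent directory; the sample data and the separator are assumptions. The sequence '\'' closes the outer single quote, adds a literal quote, and reopens it, which is how the awk program is embedded in the bash -c script:

```shell
# Throwaway sample data (assumed layout: comma-separated, ID in column 1).
demo=$(mktemp -d)
mkdir -p "$demo/27S_544"
printf 'ID,Rds\n864585,XX\n845335,YY\n' > "$demo/27S_544/scaffolds.fasta"

# {} + hands many files to a single bash; "for file do" loops over "$@".
# _ fills $0, so the file names start at $1.
find "$demo" -name '*.fasta' -exec bash -c '
    for file do
        dir=${file%/*}        # strip the file name
        prefix=${dir##*/}     # keep only the last directory component
        awk -F, -v OFS=, -v a="$prefix" '\''NR != 1 { $1 = a "_" $1 } 1'\'' "$file" > "${file}_prefix.csv"
    done
' _ {} +

cat "$demo/27S_544/scaffolds.fasta_prefix.csv"
```

Batching with + starts far fewer bash processes than \; when there are many files, at the cost of needing the explicit loop.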

I recommend reading bashfaq/001, revisiting the find man page (in particular -exec and -printf), and rereading an introduction to how bash and awk handle variables.

User contributions licensed under: CC BY-SA