Pass argument to awk inside do loop

I have a large number of tab-separated text files with a score of interest in the second column:

test_score_1.txt

Title   FRED Chemgauss4 File
24937   -6.111582   A
24972   -7.644171   A
26246   -8.551361   A
21453   -7.291059   A

test_score_2.txt

Title   FRED Chemgauss4 File
14721   -7.322331   B
27280   -6.229842   B
21451   -8.407396   B
10035   -7.482369   B
10037   -7.706176   B

I want to check if I have Titles with a score smaller than a number I define.

The following script hard-codes the threshold and works:

check_scores_1.sh

#!/bin/bash

find . -name 'test_score_*.txt' -type f -print0 |
while read -r -d $'' x; do
    awk '{FS = "\t" ; if ($2 < -7.5) print $0}' "$x"
done

If I try to pass the threshold as an argument instead, as in check_scores_2.sh below, and run it as check_scores_2.sh "-7.5", it returns all entries from both files.

check_scores_2.sh

#!/bin/bash

find . -name 'test_score_*.txt' -type f -print0 |
while read -r -d $'' x; do
    awk '{FS = "\t" ; if ($2 < ARGV[1]) print $0}' "$x"
done

Finally, check_scores_3.sh reveals that I’m actually not passing any arguments from my command line.

check_scores_3.sh

#!/bin/bash

find . -name 'test_score_*.txt' -type f -print0 |
while read -r -d $'' x; do
    awk '{print ARGV[0] "\t" ARGV[1] "\t" ARGV[2]}' "$x"
done

$ ./check_scores_3.sh "-7.5" gives the following output:

awk ./test_score_1.txt  
awk ./test_score_1.txt  
awk ./test_score_1.txt  
awk ./test_score_1.txt  
awk ./test_score_1.txt  
awk ./test_score_2.txt  
awk ./test_score_2.txt  
awk ./test_score_2.txt  
awk ./test_score_2.txt  
awk ./test_score_2.txt  
awk ./test_score_2.txt  

What am I doing wrong?

Answer

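Inside awk, ARGV holds the arguments passed to awk itself (here only the file name), not the arguments of the enclosing shell script. In the shell script, the first argument to the script is available as $1. You can assign that value to an awk variable as follows: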

find . -name 'test_score_*.txt' -type f -exec awk -v a="$1" -F'\t' '$2 < a' {} +
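
As a rough sketch of how that one-liner might sit inside a complete script, the version below wraps it in a function. The name filter_scores, the optional directory argument, the FNR > 1 header skip, and the a + 0 numeric coercion are all additions for illustration and are not part of the command above.

```shell
#!/bin/bash
# filter_scores is a hypothetical helper name, not from the original answer.
# $1 = threshold (e.g. -7.5), $2 = directory to search (defaults to .)
filter_scores() {
    local threshold=$1 dir=${2:-.}
    # -v copies the shell value into the awk variable a.
    # FNR > 1 skips each file's header line (an added assumption).
    # a + 0 forces a numeric comparison even if a was given as text.
    find "$dir" -name 'test_score_*.txt' -type f \
        -exec awk -v a="$threshold" -F'\t' 'FNR > 1 && $2 < a + 0' {} +
}

# Demo on throwaway data modeled on test_score_1.txt.
demo_dir=$(mktemp -d)
printf 'Title\tFRED Chemgauss4\tFile\n24937\t-6.111582\tA\n24972\t-7.644171\tA\n26246\t-8.551361\tA\n' \
    > "$demo_dir/test_score_1.txt"
filter_scores -7.5 "$demo_dir"
rm -rf "$demo_dir"
```

In a real script, the last lines would be replaced by a call such as filter_scores "$1", so that ./check_scores_2.sh "-7.5" passes the threshold through.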

Discussion

  • Your print0/while read loop is very good. The -exec option offered by find, however, makes it possible to run the same command without any explicit looping.

  • The command {if ($2 < -7.5) print $0} can optionally be simplified to just the condition $2 < -7.5. This is because the default action for a condition is print $0.

  • Note that $1 and $2 here are entirely unrelated to each other. Because $1 is in double quotes, the shell substitutes its value before the awk command starts to run; the shell interprets $1 as the first argument to the script. Because $2 appears in single quotes, the shell leaves it alone and awk interprets it as the second field of its current record.
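
The quoting point in the last bullet can be seen in a small experiment. This is only an illustrative sketch: set -- stands in for running the script with one argument, and the variable names line and result are made up for the demo.

```shell
#!/bin/bash
# Simulate running the script as: ./script.sh "-7.5"
set -- "-7.5"

# One sample record, tab-separated (values taken from test_score_1.txt).
line=$'24972\t-7.644171\tA'

# "$1" is in double quotes: the shell expands it before awk starts.
# The awk program is in single quotes: the shell leaves $2 alone,
# and awk reads it as the second field of the record.
result=$(printf '%s\n' "$line" |
    awk -v a="$1" -F'\t' '{ print "shell $1 = " a ", awk $2 = " $2 }')

echo "$result"
```

Here the shell's $1 arrives in awk as the variable a, while $2 is resolved by awk against the input record.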

User contributions licensed under: CC BY-SA