How to extract the names of compounds present in sub-files?

Question

I have a list (file name: uniq-compounds) containing the names of 15,000 folders (e.g. ligand_*), selected out of 50,000 folders. Each folder contains a file, out.pdbqt, which has the compound name on its 3rd line (e.g. Name = 1-tert-butyl-5-oxo-N-[2-(3-pyridinyl)ethyl]-3-pyrrolidinecarboxamide). I want to extract the compound names for all 15,000 folders listed in that file.

Accepted Answer

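Assuming uniq-compound.txt contains the folder names (one per line), each folder contains an out.pdbqt, and the compound name appears on the 3rd line of out.pdbqt, the script below will work:

    #!/bin/bash
    # Read the folder names one per line
    while IFS= read -r line; do
        # Print the 4th field of line 3 of that folder's out.pdbqt
        awk 'FNR == 3 {print $4}' "$line/out.pdbqt"
    done < uniq-compound.txt

The loop iterates through uniq-compound.txt line by line. For each line (i.e. each folder name), it uses awk to print the 4th whitespace-separated field of the 3rd line of the out.pdbqt file inside that folder. Run it from the directory that contains the folders, since the paths are relative. Note that $4 is a single field, so a compound name containing spaces would be cut off at the first space.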
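If some folders might be missing out.pdbqt, or some names contain spaces, a slightly more defensive variant of the loop may help. This is only a sketch: the function name extract_names is hypothetical, and it assumes the 3rd line has the form "REMARK  Name = <compound name>", which matches the example in the question but has not been checked against real files. It prints the folder and the name separated by a tab, and reports missing files on stderr.

```shell
#!/bin/bash
# Sketch: print "folder<TAB>compound name" for every folder listed in a file.
# extract_names is a hypothetical helper, not part of the original answer.
extract_names() {
    local list_file=$1
    local dir name
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline
    while IFS= read -r dir || [ -n "$dir" ]; do
        # Skip blank lines in the list
        [ -n "$dir" ] || continue
        if [ -f "$dir/out.pdbqt" ]; then
            # On line 3, strip everything up to and including "Name = "
            # (assumed line format), print the rest, and stop reading the file
            name=$(awk 'FNR == 3 { sub(/^.*Name = /, ""); print; exit }' "$dir/out.pdbqt")
            printf '%s\t%s\n' "$dir" "$name"
        else
            # Report missing files on stderr so they do not mix with the results
            echo "missing: $dir/out.pdbqt" >&2
        fi
    done < "$list_file"
}
```

After sourcing or pasting the function, it could be called as `extract_names uniq-compound.txt > names.tsv 2> missing.log`, where names.tsv and missing.log are example output file names.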


