
How to extract compound names from files inside subfolders?

I have a file (uniq-compounds) that lists the names of 15,000 folders (e.g. ligand_*). Each folder contains a file, out.pdbqt, whose 3rd row holds the compound name (e.g. Name = 1-tert-butyl-5-oxo-N-[2-(3-pyridinyl)ethyl]-3-pyrrolidinecarboxamide). These 15,000 folders sit among 50,000 folders in total. How can I extract the compound names for just those 15,000 folders, using the uniq-compounds file as input?

directory and subfiles


out.pdbqt



Answer

This assumes that uniq-compound.txt contains the folder names, one per line, that each folder contains an out.pdbqt, and that the compound name appears in the 3rd row of out.pdbqt. If that is the case, a short shell loop over the list does the job.


The loop reads uniq-compound.txt line by line. For each line (i.e. a folder name), it uses awk to print the 4th column of the 3rd line of the out.pdbqt file inside that folder.
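A minimal sketch of the loop described above, under a few assumptions: the 3rd line of out.pdbqt has the name as its 4th whitespace-separated field (for example "REMARK Name = <compound-name>"), and the folder names in the list are relative to a base directory. The function name extract_names and its base_dir argument are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: print the compound name from line 3 of each listed folder's out.pdbqt.
# Assumption: the name is the 4th whitespace-separated field of that line,
# e.g. "REMARK  Name = <compound-name>".

extract_names() {
    local list_file=$1        # file with one folder name per line (e.g. uniq-compound.txt)
    local base_dir=${2:-.}    # hypothetical: directory that holds the ligand_* folders
    local folder

    # Read the list line by line; the "|| [ -n ... ]" part keeps a last line
    # that has no trailing newline.
    while IFS= read -r folder || [ -n "$folder" ]; do
        [ -z "$folder" ] && continue    # skip empty lines
        if [ -f "$base_dir/$folder/out.pdbqt" ]; then
            # NR == 3 selects the 3rd line; $4 is its 4th column.
            awk 'NR == 3 { print $4; exit }' "$base_dir/$folder/out.pdbqt"
        else
            echo "missing: $folder/out.pdbqt" >&2
        fi
    done < "$list_file"
}
```

Run from the directory that contains the folders, it could be called as `extract_names uniq-compound.txt > compound-names.txt` (the output file name is just an example). Only the folders named in the list are opened, so the other folders are never touched. If a compound name could contain spaces, `$4` would capture only its first word and the awk expression would need adjusting.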

User contributions licensed under: CC BY-SA