How can I match file names in a directory against a pattern and output the matches?

There are many files in this directory:

[ichen@ui01 data]$ ls
data.list
data.root
ntuple.data15_13TeV.00276262.DAOD_FTAG2.root
ntuple.data15_13TeV.00276329.DAOD_FTAG2.root
ntuple.data15_13TeV.00276336.DAOD_FTAG2.root
ntuple.data15_13TeV.00276416.DAOD_FTAG2.root
ntuple.data15_13TeV.00276511.DAOD_FTAG2.root

I want to make a list that contains only the files whose names have this pattern:

    [many chars].[many chars].[many numbers].[many chars].root

so that it matches file names such as:

ntuple.data15_13TeV.00276262.DAOD_FTAG1.root
ntuple.data15_13TeV.00276329.DAOD_FTAG2.root
ntuple.data15_13TeV.00276336.DAOD_FTAG3.root
etc...

How can I use a regexp to achieve this? Maybe with something like this:

for f in `ls`;do if [....];then echo $f;fi;done  > log.list

Answer

In regexp land, many roads lead to Rome. 🙂

ls | egrep '^\w*\.\w*\.[0-9]*\.\w*\.root$'

^ marks the beginning of a line
$ marks the end of a line
\w is a word character (a letter, a digit or an underscore)
\w* is any number of word characters
\. is a literal '.' character; an unescaped '.' in a regular expression stands for "any character"
[0-9] is any one digit from 0 to 9
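
To see how these pieces combine, here is a minimal sketch that runs the pattern over some of the file names from the question. The helper name filter_names is made up for this illustration. The sketch assumes that spelling \w out as the POSIX class [[:alnum:]_] and using grep -E (the current spelling of egrep) is acceptable, since \w is not part of POSIX regular expressions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads names on stdin and prints only those of the form
# <word chars>.<word chars>.<digits>.<word chars>.root
filter_names() {
    grep -E '^[[:alnum:]_]*\.[[:alnum:]_]*\.[0-9]*\.[[:alnum:]_]*\.root$'
}

# Sample input taken from the directory listing in the question.
names='data.list
data.root
ntuple.data15_13TeV.00276262.DAOD_FTAG2.root
ntuple.data15_13TeV.00276329.DAOD_FTAG2.root'

# data.list and data.root are filtered out because they lack the four
# literal dots that the pattern requires.
matched=$(printf '%s\n' "$names" | filter_names)
printf '%s\n' "$matched"
```

In a real directory the input would come from ls (or a glob) instead of the sample string, and the output could be redirected to log.list.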

And for your specific example:

for f in `ls`; do echo "$f" | egrep '^\w*\.\w*\.[0-9]*\.\w*\.root$'; done

And now including the if statement:

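re='^\w*\.\w*\.[0-9]*\.\w*\.root$'; for f in `ls`; do if [[ $f =~ $re ]]; then echo "$f"; fi; done

Since bash 3.2, quoting the right-hand side of =~ makes bash compare it as a literal string instead of a regular expression, which is the most likely reason my quoted, anchored pattern did not match. Keeping the pattern in a variable and using it unquoted avoids the problem, so the anchors (^…$) can stay. In general, =~ tests a string against an extended regular expression.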
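
A minimal sketch of the difference between an unquoted and a quoted right-hand side of =~, assuming bash 3.2 or later with default settings. The helper names is_ntuple and is_ntuple_quoted are hypothetical, and the pattern uses [[:alnum:]_] in place of \w to stay within POSIX extended regular expressions.

```shell
#!/usr/bin/env bash
# The pattern lives in a variable so it can be used unquoted after =~.
re='^[[:alnum:]_]*\.[[:alnum:]_]*\.[0-9]*\.[[:alnum:]_]*\.root$'

# Hypothetical helper: succeeds if the name matches the regular expression.
is_ntuple() {
    [[ $1 =~ $re ]]
}

# Same test with the pattern quoted: bash 3.2+ treats the quoted text as a
# literal string, so the anchors and classes are no longer regex syntax.
is_ntuple_quoted() {
    [[ $1 =~ "$re" ]]
}

f='ntuple.data15_13TeV.00276262.DAOD_FTAG2.root'
if is_ntuple "$f"; then echo "unquoted pattern: match"; fi
if ! is_ntuple_quoted "$f"; then echo "quoted pattern: no match"; fi
```

The same unquoted-variable form drops straight into the loop from the question, with the output redirected to log.list.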

User contributions licensed under: CC BY-SA