There are many files in this directory:
[ichen@ui01 data]$ ls
data.list
data.root
ntuple.data15_13TeV.00276262.DAOD_FTAG2.root
ntuple.data15_13TeV.00276329.DAOD_FTAG2.root
ntuple.data15_13TeV.00276336.DAOD_FTAG2.root
ntuple.data15_13TeV.00276416.DAOD_FTAG2.root
ntuple.data15_13TeV.00276511.DAOD_FTAG2.root
I want to make a list that contains only the files whose names match the pattern:
[many chars].[many chars].[many numbers].[many chars].root
so that it matches file names such as:
ntuple.data15_13TeV.00276262.DAOD_FTAG1.root ntuple.data15_13TeV.00276329.DAOD_FTAG2.root ntuple.data15_13TeV.00276336.DAOD_FTAG3.root etc...
How can I use a regexp to achieve this? Maybe with something like this:
for f in `ls`;do if [....];then echo $f;fi;done > log.list
Answer
In regexp land, many roads lead to Rome. 🙂
ls | grep -E '^\w*\.\w*\.[0-9]*\.\w*\.root$'
^ marks the beginning of a line
$ marks the end of a line
\w is a word character (a letter, a digit, or an underscore)
\w* is any number of word characters
\. is a literal '.' character; an unescaped '.' in a regular expression stands for "any character"
[0-9] is any digit from 0 to 9
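Before pointing the pattern at the real directory, it can be tried on a few sample names. This is a minimal sketch, assuming GNU grep (\w is a GNU extension) and using file names from the question; the function name filter_ntuples is made up for illustration.

```shell
#!/usr/bin/env bash
# filter_ntuples is a hypothetical helper: it reads names on stdin and keeps
# only those shaped like word.word.digits.word.root
filter_ntuples() {
    grep -E '^\w*\.\w*\.[0-9]*\.\w*\.root$'
}

# Feed it a mix of names that should and should not pass the filter.
printf '%s\n' \
    data.list \
    data.root \
    ntuple.data15_13TeV.00276262.DAOD_FTAG2.root \
    ntuple.data15_13TeV.00276329.DAOD_FTAG2.root |
    filter_ntuples
```

Only the two ntuple names should come out; data.list and data.root do not have the four dot-separated fields that the pattern requires before .root.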
And for your specific example:
for f in *; do echo "$f" | grep -E '^\w*\.\w*\.[0-9]*\.\w*\.root$'; done
And now including the if statement:
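re='^[[:alnum:]_]*\.[[:alnum:]_]*\.[0-9]*\.[[:alnum:]_]*\.root$'; for f in *; do if [[ $f =~ $re ]]; then echo "$f"; fi; done
In general, =~ checks a string against a regular expression. Two things differ from the grep version. bash (3.2 and later) treats a quoted right-hand side of =~ as a literal string, so the pattern goes into a variable that is expanded unquoted; with that, the ^…$ anchors work as expected. And \w is a GNU extension rather than part of POSIX extended regular expressions, so [[:alnum:]_] is used in its place.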
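One likely source of surprises with =~ is quoting: in bash 3.2 and later, a quoted right-hand side is compared as literal text instead of being used as a regular expression. The sketch below shows the difference on one file name from the question. It assumes bash 3.2 or later with default compatibility settings; the helper names matches_unquoted and matches_quoted are invented for the comparison.

```shell
#!/usr/bin/env bash
# Pattern kept in a variable; [[:alnum:]_] stands in for \w, which POSIX ERE lacks.
re='^[[:alnum:]_]*\.[[:alnum:]_]*\.[0-9]*\.[[:alnum:]_]*\.root$'

# Hypothetical helpers, named only for this comparison.
matches_unquoted() { [[ $1 =~ $re ]]; }    # $re is used as a regular expression
matches_quoted()   { [[ $1 =~ "$re" ]]; }  # "$re" is compared as literal text

f='ntuple.data15_13TeV.00276262.DAOD_FTAG2.root'
if matches_unquoted "$f"; then echo "unquoted: match"; else echo "unquoted: no match"; fi
if matches_quoted "$f"; then echo "quoted: match"; else echo "quoted: no match"; fi
```

Under those assumptions the first test reports a match and the second does not, because the file name does not literally contain the text of the pattern.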