How to skip multiple directories when doing a find

I’ve written a find function that searches for a string in each file in a given path, while skipping a list of directory names that I don’t want searched. I’ve placed this script in my .bashrc file to be called like so:

findTEXTinFILES /path/to/search 'text-to-find'

The find portion works great, and it colorizes the search text so that it visually stands out, but I can’t get it to skip the directories listed with -prune. I’ve read all the posts here that I could find, but none of them work for me. I’ve tried multiple variations with no luck. So I have a few questions:

  • How do you skip multiple directories?
  • How do you skip directories with just a partial name, such as those that begin with '--' or 'wp-'?
  • Can you mix -name and -path criteria in the same script?
  • Is there something else I’ve missed?

My server is CentOS 6.9 (Virtuozzo) with the bash shell.

function findTEXTinFILES {

find "$1" ! \( -name .bash_history -prune \
    -o ! -path tmp -prune \
    -o ! -path short -prune \
    -o ! -path "*/_not_used/*" -prune \
    -o ! -path backups -prune \
    -o ! -path temp_logs -prune \
    -o ! -name .cpan -prune \
    -o ! -name .cpobjcache -prune \
    -o ! -path files_to_compare -prune \
    -o ! -path logs -prune \
    -o ! -path mail -prune \
    -o ! -path old -prune \
    -o ! -path '--*' -prune \
    -o ! -path 'wp-*' -prune \
    -o ! -path '*copy*' -prune \) \
    -o -name "*" \
    -exec grep $2 -I --color -Hn '$3' '{}' 2>/dev/null \;
}

Answer

A find expression is composed primarily of tests and actions joined together with operators. It is evaluated in a standard short-circuiting manner — meaning the evaluation is stopped as soon as the result is known, without the need to evaluate all parts (e.g. true or anything is evaluated to true).

Now note that -prune is an action that always returns true. It can act on the result of any test. Also note that the default operator is -a (and).

So, the simplest pruning example, to print all files except those under some path (e.g. wp-* in your example) looks like:

find . -path './wp-*' -prune -o -print

For files whose path starts with ./wp-, the -prune action is executed and returns true, so the right-hand side of the OR operator is skipped (i.e. the file is not printed). Note that -path matches against the whole path as find builds it from the starting point, in this case ., so we have to write ./wp-* instead of wp-*.
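
The effect of the leading ./ can be checked on a throwaway tree. The following is a minimal sketch; the directory and file names (wp-content, src, and so on) are made up for the demonstration and are not from the question.

```shell
# Build a small scratch tree (all names below are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/wp-content" "$demo/src"
touch "$demo/wp-content/skipped.php" "$demo/src/kept.txt"

# Pattern rooted at "./": ./wp-content matches, gets pruned, and is neither
# printed nor descended into.
with_prefix=$(cd "$demo" && find . -path './wp-*' -prune -o -print | LC_ALL=C sort)

# Pattern without "./": no path find builds here starts with "wp-", so nothing
# is pruned and everything is printed.
without_prefix=$(cd "$demo" && find . -path 'wp-*' -prune -o -print | LC_ALL=C sort)

printf 'with ./ prefix:\n%s\n\n' "$with_prefix"
printf 'without ./ prefix:\n%s\n' "$without_prefix"
```

With the prefix, only ., ./src and ./src/kept.txt are listed; without it, the wp-content entries show up as well.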

To prune two paths, simply extend:

find . -path './wp-*' -prune -o -path ./logs -prune -o -print

Here, if the first -prune is not executed (its test is false), the second one gets a chance; if that one does not prune either, the -print action runs. Whenever either -prune is evaluated, -print is never reached.
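
As a side note, the same chain can probably be written more compactly by grouping the tests with escaped parentheses so that they share a single -prune. The sketch below compares both spellings on a scratch tree; the file and directory names are assumptions chosen for the demo.

```shell
# Scratch tree with two directories to skip and one to keep (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/wp-admin" "$demo/logs" "$demo/src"
touch "$demo/wp-admin/a.php" "$demo/logs/b.log" "$demo/src/c.txt"

# Chained form: every test carries its own -prune.
chained=$(cd "$demo" && find . -path './wp-*' -prune -o -path ./logs -prune -o -print | LC_ALL=C sort)

# Grouped form: the tests are OR-ed inside \( ... \) and share one -prune.
# The parentheses must be escaped (or quoted) so the shell passes them to find.
grouped=$(cd "$demo" && find . \( -path './wp-*' -o -path ./logs \) -prune -o -print | LC_ALL=C sort)

printf '%s\n' "$grouped"
```

Both variables end up holding the same three lines (., ./src, ./src/c.txt), so which form to use is mostly a matter of readability once the skip list grows.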

Applying this to your case:

find "$1" -name .bash_history -prune \
    -o -path "$1/tmp" -prune \
    -o -path "$1/short" -prune \
    -o -path "$1/*/_not_used/*" -prune \
    -o -path "$1/backups" -prune \
    -o -path "$1/temp_logs" -prune \
    -o -name .cpan -prune \
    -o -name .cpobjcache -prune \
    -o -path "$1/files_to_compare" -prune \
    -o -path "$1/logs" -prune \
    -o -path "$1/mail" -prune \
    -o -path "$1/old" -prune \
    -o -path "$1/--*" -prune \
    -o -path "$1/wp-*" -prune \
    -o -path "$1/*copy*" -prune \
    -o -type f -exec grep -I --color -Hn -- "$2" '{}' \; 2>/dev/null

To avoid writing $1-dependent paths, you can cd "$1" first and then use, for example, find . ... -path ./logs ....
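
One way this could look as a complete function is sketched below. It is not the asker's original: the function name findTextInFiles, the shortened skip list, and the choice to run inside a subshell are all assumptions made for illustration. It also mixes -path (for directories directly under the search root) with -name (for base names at any depth, including partial names such as 'wp-*' and '--*') in one expression, which find accepts.

```shell
# Hypothetical rewrite; the name and the shortened skip list are assumptions.
# Usage: findTextInFiles /path/to/search 'text-to-find'
findTextInFiles() {
    # The subshell keeps the cd from changing the caller's working directory.
    (
        cd "$1" || exit 1
        # -path entries are anchored at "./"; -name entries match the base name
        # at any depth (note: they also skip plain files with such names).
        find . \( -path ./tmp \
               -o -path ./logs \
               -o -name .cpan \
               -o -name 'wp-*' \
               -o -name '--*' \
               \) -prune \
            -o -type f -exec grep -I --color=auto -Hn -- "$2" {} + 2>/dev/null
    )
}
```

The -type f test keeps grep from being handed directories, and -- stops grep from treating a search text that starts with a dash as an option. Reported file names are relative to the search root because of the cd.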

User contributions licensed under: CC BY-SA