I have 4 file extensions, the result of previous work, stored in the $SEARCH array as follows:
declare -a SEARCH=("toggled" "jtr" "jtr.toggled" "cupp")
I want to produce one file list for each of the 4 extension patterns above, as follows, excluding files that have two dots and two extensions when searching for a single extension (marked "NO"):
################################################################################
1 - SEARCH FOR toggled in /media
regex : ([^/]+)(.)(toggled)$
command : find /media -type f | grep --color -P ([^/]+)(.)(toggled)$
################################################################################
/media/myfile_1.jtr.toggled --> NO
/media/myfile_1.toggled
/media/myfile_2.jtr.toggled --> NO
/media/myfile_2.toggled
/media/myfile_3.jtr.toggled --> NO
/media/myfile_3.toggled

################################################################################
2 - SEARCH FOR jtr in /media
regex : ([^/]+)(.)(jtr)$
command : find /media -type f | grep --color -P ([^/]+)(.)(jtr)$
################################################################################
/media/myfile_1.jtr
/media/myfile_2.jtr
/media/myfile_3.jtr

################################################################################
3 - SEARCH FOR jtr.toggled in /media
regex : ([^/]+)(.)(jtr.toggled)$
command : find /media -type f | grep --color -P ([^/]+)(.)(jtr.toggled)$
################################################################################
/media/myfile_1.jtr.toggled
/media/myfile_2.jtr.toggled
/media/myfile_3.jtr.toggled

################################################################################
4 - SEARCH FOR cupp in /media
regex : ([^/]+)(.)(cupp)$
command : find /media -type f | grep --color -P ([^/]+)(.)(cupp)$
################################################################################
/media/myfile_1.cupp
/media/myfile_2.cupp
/media/myfile_3.cupp
I spent hours on regex101 without success. I also tried to reach my goal with other methods, but they do not fit with the rest of the code.
Here is a code extract :
for ext in "${SEARCH[@]}"
do
    COUNTi=$((COUNTi+1))
    REGEX="([^/]+)(.)("$ext")$"
    #
    # Ideally, the Regex should come from a pattern array
    printf '%*s' "$len" | tr ' ' "$mychar"
    echo -e "\n$COUNTi - SEARCH FOR $ext in $BASEDIR"
    echo "regex : $REGEX"
    echo "command : find $BASEDIR -type f | grep --color -P $REGEX"
    printf '%*s' "$len" | tr ' ' "$mychar" && echo
    find $BASEDIR -type f | grep --color -P $REGEX    # the Regex caveats as the double dot extensions are not parsed correctly.
    echo -e "\n"
done
So my 2 questions, both related to the same piece of code:
REGEX : what would be a correct regex to list the files by extension family (pls see the 4 SEARCH patterns and related dumps)?
ARRAYS : once the above point is solved, how can I use a pattern array whose entries contain the $ext placeholder to build the REGEX inside the loop?
PATTERN+=( "([^/]+)(.)($ext)$" )
# All of these below : CAVEATS escaping $ or not...
# REGEX=${PATTERN[5]}
# REGEX=$(eval "${PATTERN[5]}" )
# echo "pattern : ${PATTERN[5]}"
# eval "$REGEX=$REGEX"
# eval "$REGEX="$REGEX""
# REGEX=$(echo "${REGEX}")
# REGEX=${!PATTERN[5]}
Notes:
I read all regex documentations for hours, tried hundreds of regex patterns, w/o success as I can’t understand these regex rationales.
I also tried other ways, for example find / -name "sayONEnameinmysearchpattern" ! -iname "theothernamesfromtehsearchpattern". This is not what I’m looking for.
Thx
Answer
Change the REGEX line in your code to:
REGEX='^(.*/|)[^/.]+\.'"$ext\$"
The Perl regular expression part that matches the basename of the file is in single quotes, which prevents the shell from trying to expand it. The $ext is in double quotes, so it will be expanded by the shell. The trailing $ is escaped with a backslash just for form; a $ immediately before the closing double quote would stay literal anyway.
The leading ^(.*/|) will match a leading directory (ending with /), the [^/.]+ will match one or more characters that are NOT ‘.’ or ‘/’. That must then be followed by a ‘.’ and your extension, followed by the end of the file name ($) to match.
The key here is to anchor your match at both ends (^ and $) and not allow any dots ‘.’ except the ones you really want.
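To see the anchoring at work without touching /media, here is a minimal sketch that feeds a few of the sample paths from the question through the regex. The helper names filter_by_ext and sample_paths are made up for this illustration, and it assumes a grep built with -P support (GNU grep).

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the paths whose basename is "<name>.<ext>",
# where <name> contains no dot. Assumes grep supports -P (GNU grep).
filter_by_ext() {
    local ext=$1
    # Single-quoted regex part + double-quoted extension + literal trailing $
    local regex='^(.*/|)[^/.]+\.'"$ext\$"
    grep -P -- "$regex"
}

# A few of the sample paths from the question, one per line.
sample_paths() {
    printf '%s\n' \
        /media/myfile_1.jtr.toggled \
        /media/myfile_1.toggled \
        /media/myfile_1.jtr \
        /media/myfile_1.cupp
}

# Print the paths that match each extension family.
for ext in toggled jtr jtr.toggled cupp; do
    echo "== $ext"
    sample_paths | filter_by_ext "$ext"
done
```

With "toggled", the file myfile_1.jtr.toggled is rejected because [^/.]+ can only consume "myfile_1" and the next characters are ".jtr", not ".toggled". Note that the dot inside "jtr.toggled" is passed through unescaped, so in the regex it matches any character, which is good enough for these file names.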
You also might want to put $REGEX in double quotes, as "$REGEX", in the grep command near the end of your code extract, so the shell does not word-split or glob-expand the pattern.
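For the second question (the pattern array), one possible approach is sketched below; it is an assumption on my part, not something taken from the code above. Instead of storing $ext inside the array entries, which forces eval tricks, each entry holds a printf-style template with %s where the extension goes. The build_regex helper is hypothetical, and the templates use [.] rather than a backslash-escaped dot so that printf has no backslash sequences to interpret.

```shell
#!/usr/bin/env bash
# Hypothetical template array: %s marks where the extension is inserted.
# Single quotes keep the trailing $ literal until printf fills in %s.
declare -a PATTERN=(
    '^(.*/|)[^/.]+[.]%s$'   # anchored form: exactly one dot before the extension
    '([^/]+)[.]%s$'         # looser form, similar in shape to the question's regex
)

# Build a regex from template number $1 and extension $2.
# Dots in the extension are rewritten to [.] so they match a literal dot.
build_regex() {
    local template=${PATTERN[$1]}
    local ext=${2//./[.]}
    printf "$template" "$ext"
}

declare -a SEARCH=("toggled" "jtr" "jtr.toggled" "cupp")
for ext in "${SEARCH[@]}"; do
    REGEX=$(build_regex 0 "$ext")
    echo "regex : $REGEX"
    # The search itself would then be (left commented out in this sketch):
    # find "$BASEDIR" -type f | grep --color -P -- "$REGEX"
done
```

Because the templates never contain a shell variable, there is nothing to expand late and no need for eval; picking another template is just a different index passed to build_regex.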