I’ve been trying to get a script working to back up some files from one machine to another, but I keep running into an issue.
Basically, what I want to do is copy two kinds of files: one .log and one (or more) .dmp. Their names always follow this format:
something_2022_01_24.log something_2022_01_24.dmp
I want to do three things with these files:
- find the second-to-last .log file (i.e. if something_2022_01_24.log is the latest, I want the one before it, say something_2022_01_22.log)
- get a substring with just the date (2022_01_22)
- copy every .dmp that matches the date (e.g. something_2022_01_24.dmp, something01_2022_01_24.dmp)
For the first one, from what I could find, the best way is ls -t *.log | head -2, as it lists the two most recently modified files, the second of which is the one I want.
As for the second one I’m more at a loss because I’m not sure how to parse the output of the first command.
The third one I think I could manage with something of the sort:
[ -f "/var/www/my_folder/*$capturedate.dmp" ] && cp "/var/www/my_folder/*$capturedate.dmp" /tmp/
What do you think: is there a way to do this? How can I extract and compare the substring?
Thanks!
Answer
Would you please try the following:
#!/bin/bash
dir="/var/www/my_folder"
second=$(ls -t "$dir/"*.log | head -n 2 | tail -n 1)
if [[ $second =~ .*_([0-9]{4}_[0-9]{2}_[0-9]{2}).log ]]; then
    capturedate=${BASH_REMATCH[1]}
    cp -p "$dir/"*"$capturedate".dmp /tmp
fi
- second=$(ls -t "$dir"/*.log | head -n 2 | tail -n 1) picks the second most recently modified log file. Please note that it assumes the file's modification time has not changed since the file was created, and that filenames do not contain special characters such as a newline. This is a simple solution and may need hardening for robustness.
- The regex .*_([0-9]{4}_[0-9]{2}_[0-9]{2}).log matches the log filename. Bash stores the date substring captured by the parentheses in ${BASH_REMATCH[1]}. Strictly speaking, the unescaped . before log matches any character; \.log$ would be tighter.
- The cp command then copies the matching files. Please be careful to keep the wildcard * outside the double quotes so that it is expanded properly.
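The note about robustness can be taken one step further. One possible direction, sketched here, is to rank the logs by the date embedded in their names instead of by modification time, and to expand the .dmp wildcard into an array so that "no match" can be detected before cp is called. The function names second_latest_date and copy_dumps_for_date are made up for illustration. The sketch assumes bash 4 or later (for mapfile) and zero-padded YYYY_MM_DD dates, which sort chronologically as plain text.

```bash
#!/bin/bash

# Hypothetical helper: print the second-newest date (YYYY_MM_DD) found in
# the *.log filenames of directory $1. Returns 1 if fewer than two distinct
# dates exist. File modification times are not consulted.
second_latest_date() {
    local dir=$1 f
    local re='_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log$'
    local dates=() sorted=()
    for f in "$dir"/*.log; do
        # Collect the captured date of every name that matches the pattern.
        [[ $f =~ $re ]] && dates+=("${BASH_REMATCH[1]}")
    done
    # Sort the dates as text and drop duplicates.
    mapfile -t sorted < <(printf '%s\n' "${dates[@]}" | sort -u)
    (( ${#sorted[@]} >= 2 )) || return 1
    printf '%s\n' "${sorted[${#sorted[@]}-2]}"
}

# Hypothetical helper: copy every *<date>.dmp from $1 to directory $3.
# Returns 1 without calling cp when nothing matches.
copy_dumps_for_date() {
    local dir=$1 date=$2 dest=$3
    local files
    shopt -s nullglob                 # an unmatched glob expands to nothing
    files=("$dir"/*"$date".dmp)       # wildcard stays outside the quotes
    shopt -u nullglob
    (( ${#files[@]} > 0 )) || return 1
    cp -p -- "${files[@]}" "$dest"/
}
```

It could be used along these lines:
date=$(second_latest_date /var/www/my_folder) && copy_dumps_for_date /var/www/my_folder "$date" /tmp
Note that the sketch picks the second-newest distinct date, which differs from the ls -t approach if several logs share a date or if timestamps and names disagree.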
FYI, here are some alternatives for extracting the date string.
With sed:
capturedate=$(sed -E 's/.*_([0-9]{4}_[0-9]{2}_[0-9]{2}).log/\1/' <<< "$second")
With bash parameter expansion (if something does not include underscores; the directory part is stripped first because my_folder contains one):
capturedate=${second##*/}
capturedate=${capturedate%.log}
capturedate=${capturedate#*_}
With the cut command (if something does not include underscores; again the directory part is stripped first):
base=${second##*/}
capturedate=$(cut -d_ -f2,3,4 <<< "${base%.log}")
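To compare the alternatives side by side, here is a small sketch that wraps each one in a function and runs them all on the same input. The function names and the sample path are made up for illustration. The parameter expansion and cut variants drop the directory part first, because an underscore in the path (as in my_folder) would otherwise shift the result.

```bash
#!/bin/bash

# Hypothetical sample path; note the underscore in "my_folder".
second="/var/www/my_folder/something_2022_01_22.log"

# Regex match with the date captured in BASH_REMATCH[1].
extract_regex() {
    local re='_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log$'
    [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# sed with an extended regex and a backreference to the captured date.
extract_sed() {
    sed -E 's/.*_([0-9]{4}_[0-9]{2}_[0-9]{2})\.log$/\1/' <<< "$1"
}

# Parameter expansion: drop the directory, the suffix, then the prefix
# up to the first underscore (assumes the prefix has no underscores).
extract_expansion() {
    local s=${1##*/}
    s=${s%.log}
    printf '%s\n' "${s#*_}"
}

# cut: fields 2-4 of the underscore-separated basename
# (same assumption about the prefix).
extract_cut() {
    basename "$1" .log | cut -d_ -f2,3,4
}

for fn in extract_regex extract_sed extract_expansion extract_cut; do
    printf '%-18s %s\n' "$fn" "$("$fn" "$second")"
done
```

For this sample path all four print 2022_01_22. With a prefix such as my_app_2022_01_22.log, only the regex and sed variants would still isolate the date, which is why they are the safer default.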