I have a directory with backups. One backup file is created each day. I need to remove these files as they fill up disk space over time. I want to remove all files older than 5 days, BUT I need to keep a minimum of 5 files in the directory. This is in case the backup utility fails 5 days in a row. Currently I just delete anything older than 5 days, which in the scenario I just described would leave me with no backups at all.
I want to:
- Run a find on a directory sorted by file created date
- Remove the 5 most recent lines (files)
- Run rm on the files that are left from the find command
I can do this in Python or a multitude of ways, but would prefer to keep it simple with a bash script.
Is there a way to do this in bash? I was thinking of maybe piping it to a file then removing the lines that way. Don’t know if that is smarter.
Answer
Use python.
If you KNOW there are no filenames with embedded spaces, newlines, or other naughty bits, you could do something like this:
stat -c "%W %n" * | sort -rn | tail -n +6 | awk '{print $2}' | xargs -r rm
Edit/upgrade, per KamilKuk, with thanks! Nulls FTW!
stat --printf='%W\t%n\0' * | sort -z -rn | tail -z -n +6 | cut -z -f2- | xargs -0 -r rm
…but you don’t, because people will find a way to do something stupid sooner or later and botch up your filesystem, so don’t do that.
As an exercise, here’s one way to do it in a bash script:
declare -A lst=()                  # create empty associative array
for f in *                         # for each file
do
  ts=$( stat -c "%W" "$f" )        # get the creation timestamp
  lst[$ts]="$f"                    # use timestamp as key, name as value
done                               # (files sharing a timestamp overwrite each other)
old=( $(                           # create simple array with subshell output
  printf "%s\n" "${!lst[@]}" |     # print single column of timestamp keys
  sort -rn |                       # reverse numerical sort
  tail -n +6                       # skip the 5 most recent, keep the rest
) )
for ts in "${old[@]}"; do rm "./${lst[$ts]}"; done   # delete everything except the 5 most recent files
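None of the snippets above applies the "older than 5 days" condition from the question; they only look at which files are the most recent. Here is a rough sketch that combines both rules, assuming GNU find, sort, and tail (for `-printf` and `-z`). The function name `prune_backups` and its arguments are made up for illustration. It also uses modification time rather than birth time, since `stat`'s `%W` reports 0 on filesystems that don't record a birth time.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prune_backups DIR [KEEP] [DAYS]
# Deletes regular files in DIR whose mtime is older than DAYS days,
# but never touches the KEEP most recently modified files.
# Assumes GNU find, sort, and tail (for -printf and -z).
prune_backups() {
    local dir=$1 keep=${2:-5} days=${3:-5}
    local cutoff rec ts path
    cutoff=$(( $(date +%s) - days * 86400 ))

    # Records look like "mtime<TAB>path<NUL>", sorted newest first;
    # tail drops the newest $keep records so they are never candidates.
    while IFS= read -r -d '' rec; do
        ts=${rec%%$'\t'*}      # everything before the first tab
        path=${rec#*$'\t'}     # everything after the first tab
        # ts has a fractional part (e.g. 1700000000.1234567890); drop it.
        if (( ${ts%.*} < cutoff )); then
            rm -- "$path"
        fi
    done < <(
        find "$dir" -maxdepth 1 -type f -printf '%T@\t%p\0' |
            sort -z -rn |
            tail -z -n +"$((keep + 1))"
    )
}
```

Called as `prune_backups /path/to/backups`, it defaults to keeping 5 files and a 5-day cutoff. Because the records are NUL-terminated, names with spaces or newlines should pass through intact.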
See why you might want to use python? 🙂