Recursively delete all but the one newest file throughout all directories

My system depends on only having one file (PDF, DOCX) per subdirectory. There are thousands and thousands of subdirectories. Due to a permission error, in some of them, I have ended up with more than one file. In these instances, I only want to keep the one most recently modified file.

I was able to export a list of directories that contain more than one file successfully:

find . -type f -printf '%h\n' | sort | uniq -d >test.txt

So I end up with a nice list of all those directories that I need to look at. But it’s rather long.

I was also able to automate the deletion of everything but the most recently modified file in a directory:

ls -t | tail -n +2 | xargs -d '\n' rm -f

That does remove all files but the most recently modified one.

The problem I am running into is that the second command only works within that directory. I have not figured out a way to apply it recursively to all directories.

I have attempted:

find /data/test/CONTAINER/SANDBOX -type f -exec sh -c 'ls -t | tail -n +2 | xargs -d '\n' rm -f ' {} \;

but that just yielded xargs: argument line too long

I have tried to adjust the xargs parameters, but I am sure there must be a better way to do this. Perhaps a shell script that reads the folders from test.txt, cds into each one, and then runs command 2 there? Or simply a way to apply command 2 recursively to all subfolders, regardless of how many files each folder contains?

My last thought was that command 3 runs from the main directory, which holds hundreds of thousands of directories, so it is no wonder the argument line could be too long. But adding -mindepth 2 didn't change a thing.

Thank you

Answer

I think the following script should do the trick for you.

#!/bin/bash

DIR_TO_FIND="/path/to/dir"

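# Visit every directory; the subshell keeps each cd from leaking into the next iteration.
find "$DIR_TO_FIND" -type d | while IFS= read -r DIR; do
    (
        cd "$DIR" || exit
        # -p marks directories with a trailing slash so grep can drop them;
        # otherwise a subdirectory could count as the "newest" entry.
        # tail skips the newest file, and -r stops rm from running on empty input.
        ls -tp | grep -v '/$' | tail -n +2 | xargs -d '\n' -r rm -f --
    )
done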
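
If parsing the output of ls is a concern (file names containing newlines would break it), a possible variant is sketched below. It is only a sketch under a few assumptions: the function name keep_newest_per_dir and its --delete switch are made up for illustration, and it relies on the shell's test builtin supporting -nt ("newer than"), which bash and dash do but POSIX does not require. Without --delete it only prints what it would remove, so the result can be checked first.

```shell
#!/bin/sh
# keep_newest_per_dir ROOT [--delete]   (hypothetical helper, not from the question)
# In every directory under ROOT, keep the most recently modified regular file
# and remove the other regular files. Without --delete it is a dry run.
keep_newest_per_dir() {
    # find hands batches of directories to one inner shell via "{} +".
    find "$1" -type d -exec sh -c '
        mode=$1
        shift
        for dir do
            newest=
            # Pass 1: pick the newest regular file directly inside $dir.
            for f in "$dir"/* "$dir"/.[!.]*; do
                [ -f "$f" ] && [ ! -L "$f" ] || continue
                if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
                    newest=$f
                fi
            done
            # Nothing to do in directories without regular files.
            [ -n "$newest" ] || continue
            # Pass 2: remove (or report) every other regular file.
            for f in "$dir"/* "$dir"/.[!.]*; do
                [ -f "$f" ] && [ ! -L "$f" ] || continue
                [ "$f" = "$newest" ] && continue
                if [ "$mode" = "--delete" ]; then
                    rm -f -- "$f"
                else
                    printf "would remove: %s\n" "$f"
                fi
            done
        done
    ' sh "${2:-}" {} +
}
```

Run keep_newest_per_dir /path/to/dir to preview, then keep_newest_per_dir /path/to/dir --delete once the list looks right. Subdirectories and symlinks are never removed, and files with identical modification times are not ordered in any particular way.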
User contributions licensed under: CC BY-SA