I’m trying to apply the cmp command to a number of consecutive JPG files that have the same size but different names, in order to make sure they are indeed the same. Since there are almost 4000 files, I would like to write a for loop that runs cmp over them and produces a final list of the files that are actually identical, but so far I haven’t been able to.
This is a sample of the file list:
-rw-r--r-- 1 giu_ 1094433 dic 30 09:12 IMG_0199.JPG
-rw-r--r-- 1 giu_ 1094433 lug 30 2016 img_0199_28043673584_o.jpg
-rw-r--r-- 1 giu_ 1124837 dic 30 09:12 IMG_0103.JPG
-rw-r--r-- 1 giu_ 1124837 lug 30 2016 img_0103_28045527533_o.jpg
-rw-r--r-- 1 giu_ 1174143 ago 12 2016 img_1520_28906930111_o.jpg
-rw-r--r-- 1 giu_ 1174143 dic 30 12:33 IMG_1520.JPG
-rw-r--r-- 1 giu_ 1227753 dic 30 09:12 IMG_0104.JPG
-rw-r--r-- 1 giu_ 1227753 lug 30 2016 img_0104_28044608674_o.jpg
Answer
Unless this is a coding exercise (in which case my recommendation is not applicable), look into fdupes. It does exactly what you want.
FDUPES(1)                General Commands Manual                FDUPES(1)

NAME
       fdupes - finds duplicate files in a given set of directories

SYNOPSIS
       fdupes [ options ] DIRECTORY ...

DESCRIPTION
       Searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison.
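If it is a coding exercise, or fdupes is not installed, the cmp loop from the question could look roughly like the sketch below. It is only an illustration under a few assumptions: same_files is a made-up helper name, the files sit in one directory, their names contain no newlines or trailing blanks, and each file is compared only with its neighbour in size order (which matches the listing above, where duplicates come in adjacent pairs). If three or more files share a size, some identical pairs can be missed.

```shell
#!/bin/sh
# same_files DIR: print tab-separated pairs of byte-identical files in DIR.
# Hypothetical helper, not part of cmp or fdupes.
same_files() {
    dir=${1:-.}
    # Emit "size name" for every regular file, then sort by size so that
    # files of equal size end up on consecutive lines.
    for f in "$dir"/*; do
        [ -f "$f" ] && printf '%s %s\n' "$(wc -c < "$f")" "$f"
    done | sort -n | {
        prev_size=''
        prev_file=''
        while read -r size file; do
            # Only run cmp when the sizes match; -s keeps cmp silent and
            # reports the result through its exit status.
            if [ "$size" = "$prev_size" ] && cmp -s "$prev_file" "$file"; then
                printf '%s\t%s\n' "$prev_file" "$file"
            fi
            prev_size=$size
            prev_file=$file
        done
    }
}
```

Calling same_files . in the photo directory should print one line per identical pair, which you can redirect to a file to get the final list.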