bash / cmp: compare consecutive .jpg files of the same size in a long list

I’m trying to apply the cmp command to a number of consecutive jpg files that have the same size but different names, in order to make sure they are indeed the same. Since there are almost 4000 files, I would like to loop over them with cmp in a for loop and produce a final list of the files that are actually identical, but so far I haven’t been able to.

This is a sample of the file list:

-rw-r--r-- 1 giu_  1094433 dic 30 09:12 IMG_0199.JPG  
-rw-r--r-- 1 giu_  1094433 lug 30  2016 img_0199_28043673584_o.jpg  
-rw-r--r-- 1 giu_  1124837 dic 30 09:12 IMG_0103.JPG  
-rw-r--r-- 1 giu_  1124837 lug 30  2016 img_0103_28045527533_o.jpg  
-rw-r--r-- 1 giu_  1174143 ago 12  2016 img_1520_28906930111_o.jpg  
-rw-r--r-- 1 giu_  1174143 dic 30 12:33 IMG_1520.JPG  
-rw-r--r-- 1 giu_  1227753 dic 30 09:12 IMG_0104.JPG  
-rw-r--r-- 1 giu_  1227753 lug 30  2016 img_0104_28044608674_o.jpg  

Answer

Unless this is a coding exercise (in which case my recommendation is not applicable), look into fdupes. It does exactly what you want.

FDUPES(1)                         General Commands Manual                        FDUPES(1)

NAME
       fdupes - finds duplicate files in a given set of directories

SYNOPSIS
       fdupes [ options ] DIRECTORY ...

DESCRIPTION
       Searches the given path for duplicate files. Such files are found by comparing file
       sizes and MD5 signatures, followed by a byte-by-byte comparison.
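
If you do want the cmp loop from the question, here is a minimal bash sketch of the same idea: group files by size, then compare byte-by-byte only within each size group (it skips the MD5 step that fdupes uses). The function name list_same_pairs is made up for illustration, and the sketch assumes all files sit in one directory, match *.jpg case-insensitively, and have no newlines in their names.

```shell
#!/usr/bin/env bash
# list_same_pairs DIR (hypothetical helper, not part of fdupes or cmp):
# print "A == B" for every file B that is byte-identical to an earlier
# same-size file A in DIR.
list_same_pairs() {
    local dir=${1:-.}
    local f size other group_size=""
    local -a group=()

    # Emit "size<TAB>path" per file, then sort numerically by size so that
    # files with equal sizes end up on adjacent lines.
    for f in "$dir"/*.[jJ][pP][gG]; do
        [ -f "$f" ] || continue          # skips the unexpanded glob too
        printf '%s\t%s\n' "$(($(wc -c < "$f")))" "$f"
    done | sort -n -k1,1 |
    while IFS=$'\t' read -r size f; do
        # A new size starts a new group of candidates.
        if [ "$size" != "$group_size" ]; then
            group=()
            group_size=$size
        fi
        # cmp -s is silent and returns 0 only if the contents are identical.
        for other in "${group[@]}"; do
            if cmp -s -- "$other" "$f"; then
                printf '%s == %s\n' "$other" "$f"
                break
            fi
        done
        group+=("$f")
    done
}
```

After sourcing the file, calling list_same_pairs /path/to/photos (the path is a placeholder) prints one line per duplicate found. Comparing against every earlier file in the size group, rather than only the previous one, still finds a match when three or more files happen to share a size.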
User contributions licensed under: CC BY-SA