[magick-users] Find Duplicates

Illtud Daniel illtud.daniel at llgc.org.uk
Fri Jan 19 08:03:53 CST 2007


OOzy Pal wrote:

> Ok your code worked fine but I have to run it multiple times (number
> unknown) if the files have 3 copies or more.

Yes. I imagined you only had two copies.

> I have an idea, why don't I move the unique files like

Good idea if there are multiple copies of some images.

> md5sum * | sort | awk {'print $2 " " $1'}  | uniq -df 1| awk {'print $1'}
> 
> this will print only unique file names. I tried this but it didn't help.

Leave out the 'd' flag on uniq. And if all your files are jpgs, add
a filter to md5sum or it'll try and process the 'duplicates' directory
we created earlier.
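
To see what the 'd' flag changes, here's a rough sketch on made-up
input (the file names and checksums are invented, and 'sample' is just a
hypothetical helper, not something from your directory):

```shell
# Hypothetical "name checksum" lines; x.jpg and y.jpg share a checksum.
sample() {
    printf '%s\n' 'x.jpg aaa' 'y.jpg aaa' 'z.jpg bbb'
}

# -f 1 skips the file name field, so only the checksums are compared.
# Without -d: the first line of every group, i.e. one name per checksum.
sample | uniq -f 1

# With -d: only the first line of groups that occur more than once.
sample | uniq -d -f 1
```

The first command should list x.jpg and z.jpg, the second only x.jpg,
which is why -d gives you the duplicated files rather than the unique set.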

awk {'print $1'} is the same here as cut -f 1 -d ' ', but I think cut's
faster (it should be lighter than awk).

mkdir unique
md5sum *.jpg | sort | awk {'print $2 " " $1'}  | uniq -f 1 | \
cut -f1 -d ' '

Then wrap that in the for command, because:

> get error in mv command

Yes, cos you're trying to run:
mv duplicates file1 file2 file3

whilst mv is 'mv file1 file2 file3 <destination>', so you've got the
arguments the wrong way round. You may be able to persuade xargs to do
something different, but I don't know it that well.
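
Something along these lines might do it. Treat it as a sketch only: the
function names are made up for illustration, it assumes none of your
file names contain spaces, and the xargs variant relies on the standard
-I option, which I believe substitutes each input line for {}.

```shell
# Prints one file name per distinct checksum (the pipeline from above,
# with the awk program quoted the conventional way).
# Assumes file names contain no whitespace.
list_unique() {
    md5sum *.jpg | sort | awk '{print $2 " " $1}' | uniq -f 1 | cut -f1 -d ' '
}

# Variant 1: a for loop, with the destination last as mv expects.
move_unique_loop() {
    mkdir -p unique
    for f in $(list_unique); do
        mv "$f" unique/
    done
}

# Variant 2: xargs -I{} runs mv once per input line and puts the file
# name where {} appears, so the destination can stay at the end.
move_unique_xargs() {
    mkdir -p unique
    list_unique | xargs -I{} mv {} unique/
}
```

Run either move_unique_loop or move_unique_xargs from the directory that
holds the jpgs. With GNU mv, 'xargs mv -t unique' may be another way to
keep the destination in front of the file list.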

We're well off-topic for ImageMagick now, so I suggest that you reply
just to me from now on. (Any IM readers wanting to know how the exciting
tale concludes, email me...)



