Paid: Obtain PHASH of images

Posted: 2018-09-07T12:32:23-07:00
by kevinm40
Hi,

I currently have ImageMagick 7.0.8-10 Q16 x86_64 on an Amazon Linux box. I need to obtain a phash from ImageMagick which will be stored in a db for future comparison (via Hamming distance). I am after a script that will obtain & return the phash.

I understand that there may be several different ways of obtaining a phash from IM but I have no understanding of how they work or which is the best for my use case.

This will be used for establishing near-duplicates of product images where the cropping, image size and compression may be different, but the image is fundamentally the same. I have uploaded three images as examples: https://www.dropbox.com/sh/jjbyhh97n4lf ... e1sja?dl=0

In the example I would expect a close match for images 1 & 2, but not for image 3. I am not sure if it would skew results, but nearly all product images will have white backgrounds.

If anyone is interested and can help, please let me know.

Re: Paid: Obtain PHASH of images

Posted: 2018-09-07T13:17:06-07:00
by fmw42
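ImageMagick has a built-in perceptual hash; see viewtopic.php?f=4&t=24906. However, that hash consists of 42 floating-point values rather than a binary string, so it is not suited to a Hamming-distance comparison.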
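
As a rough sketch of how two such 42-value hashes might be compared once they have been extracted and stored: the function below sums the squared differences between corresponding values, so 0 means identical and larger means less similar. The name phash_distance and the choice of a sum of squared differences are assumptions for illustration, not something confirmed in this thread, and extracting the values from ImageMagick is not shown.

```python
from typing import Sequence

# Number of floats in ImageMagick's perceptual hash, per the post above.
PHASH_LENGTH = 42


def phash_distance(values_a: Sequence[float], values_b: Sequence[float]) -> float:
    """Sum of squared differences between two 42-value hashes.

    Hypothetical helper: the distance measure is an assumption.
    """
    if len(values_a) != PHASH_LENGTH or len(values_b) != PHASH_LENGTH:
        raise ValueError(f"each hash must contain {PHASH_LENGTH} values")
    return sum((a - b) ** 2 for a, b in zip(values_a, values_b))
```

Any threshold for calling two images a near-duplicate would have to be tuned on real product images.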

I also have several perceptual hash methods in my script phashes at my link below. Each method creates a binary string hash, and two such hashes can be compared using the Hamming distance, for which I have a hamming script at my link below. The hashes can be stored in the image's metadata or exported and placed in a database. My scripts are Unix bash shell scripts, so they should work on Linux. See the pointers for use on my home page.
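
For comparing binary string hashes once they are in a database, a minimal sketch of the Hamming distance follows. It assumes each hash is stored as a string of '0' and '1' characters of equal length. The function names and the max_fraction threshold are hypothetical and are not taken from the phashes or hamming scripts.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the bit positions at which two binary string hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    if set(hash_a + hash_b) - {"0", "1"}:
        raise ValueError("hashes must contain only '0' and '1'")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(hash_a: str, hash_b: str, max_fraction: float = 0.1) -> bool:
    """Treat two images as near-duplicates if few enough bits differ.

    max_fraction is an assumed, untuned threshold (fraction of differing bits).
    """
    return hamming_distance(hash_a, hash_b) <= max_fraction * len(hash_a)
```

A suitable threshold depends on the hash method and on the images, so it is best chosen by testing against known matching and non-matching pairs such as the three examples in the first post.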