perceptualHash() not documented in API

Magick++ is an object-oriented C++ interface to ImageMagick. Use this forum to discuss, make suggestions about, or report bugs concerning Magick++.
terraformer
Posts: 3
Joined: 2020-02-15T12:08:20-07:00

Post by terraformer »

perceptualHash() is in the API but not documented (at https://www.imagemagick.org/api/Magick++/index.html). I wanted to check whether this is intentional (for example, because it is not officially supported) or just an oversight. Also, is the return value stable, i.e. can I rely on what it returns not changing? I ran it on two otherwise identical JPEGs, one with a slight color variation, and their Hamming distance was 405. Totally different files had Hamming distances of 450-500, so this seemed off to me. Before running this on a large set, I wanted to see if this is expected.

ETA: I am running this on a Mac, built with Xcode, using ImageMagick 7.0.9-21, in case it matters. I am not having any issues running it, so I don't know if anyone cares about the details of the environment.

fmw42
Posts: 26383
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: perceptualHash() not documented in API

Post by fmw42 »

Does it work as expected in command line mode? Have you seen http://www.fmwconcepts.com/misc_tests/p ... index.html

terraformer
Posts: 3
Joined: 2020-02-15T12:08:20-07:00

Re: perceptualHash() not documented in API

Post by terraformer »

I didn't try it in command line mode because I was, and still am, assuming it is likely producing correct results. I did see that page, but that's one reason why I am asking about the documentation. The test results on that page show non-self-matching values of 3.7 or 5.4, etc., but it doesn't specify units, and the values discussed on that page are nowhere near 404. I just ran the command line version and I am still confused by the output. The command line output is in percentage units (a value of 30.1%) between those two files. A Hamming distance of 404* between those two files is ~48% of the 844 (or 840, the column may be offset by one in my editor) bits represented by the output of perceptualHash(). So that's what I wanted more info on. What does the output represent? Is Hamming distance not the best means for this comparison? Also, given that this is not in the docs, is this function not settled yet, i.e. may it change in the future?

FYI: I am not doing sub image searches. I am working on applying this to complete images only.

* Run this in MySQL to get the Hamming distance:
SELECT BIT_COUNT(CONVERT("ae41e84dd18b4f08b547625568e61662483b850aa1f9a8286584d4289ff5877818889bac391847478a8f38a4b7622d18cd9c62149a1c54823e98784f88d1f61bbd8b0f361bceaa966841f58a0cf89bfb620908c28161fb3acb3e848d58abe98a909623618d3dc62220" USING BINARY) ^
CONVERT("ac6958476789e6f89bf3620ac8c7c361f81b8801ce9168290784d1d8acc2862528884da996e83e28891c38859761d228a9aa61b6fa2ff582878877cf895ad61f098ab0761c7ca71d5835f48859487771619d9897778f828aa44e840698949188b6a61df58b16a61c7a" USING BINARY)) AS "copy"
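For comparison, here is a minimal C++ sketch of a bitwise Hamming distance over two equal-length hex strings. It decodes each hex digit to its 4-bit value before XORing, which may not be exactly what the MySQL expression above does, depending on how CONVERT ... USING BINARY treats the strings. The helper names hexValue and hexHammingDistance are made up for illustration and are not part of Magick++.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Convert one hex digit to its 4-bit value (input must be a valid hex digit).
unsigned hexValue(char c)
{
    if (c >= '0' && c <= '9') return static_cast<unsigned>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<unsigned>(c - 'a' + 10);
    assert(c >= 'A' && c <= 'F');
    return static_cast<unsigned>(c - 'A' + 10);
}

// Count the differing bits between two hex strings of equal length.
std::size_t hexHammingDistance(const std::string& a, const std::string& b)
{
    assert(a.size() == b.size());
    std::size_t bits = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // XOR the two 4-bit values and count the bits that are set.
        bits += std::bitset<4>(hexValue(a[i]) ^ hexValue(b[i])).count();
    }
    return bits;
}
```

If the strings are 210 hex digits long, they carry 840 bits, so a distance of 404 would be about 48% of all bits.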

fmw42
Posts: 26383
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: perceptualHash() not documented in API

Post by fmw42 »

The perceptual hash that ImageMagick uses is not designed to be compared with a Hamming distance. It is compared with a sum of squared differences metric, since all 42 values are floats.
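To illustrate the metric, here is a minimal C++ sketch of a sum of squared differences between two equal-length vectors of floats. It assumes the 42 hash values have already been extracted into std::vector<double>. The free function below is a hypothetical helper for illustration, not ImageMagick's own code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum of squared differences between two equal-length feature vectors.
// Hypothetical helper for illustration; not part of Magick++.
double sumSquaredDifferences(const std::vector<double>& a,
                             const std::vector<double>& b)
{
    assert(a.size() == b.size());
    double ssd = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];  // per-value difference
        ssd += d * d;                  // accumulate its square
    }
    return ssd;
}
```

Identical vectors give 0, and the value grows as the floats drift apart. That is why a bitwise comparison of the encoded hash is not meaningful here. In Magick++ itself, the ImagePerceptualHash class in Statistic.h appears to provide a sumSquaredDifferences() method for this purpose. Check the header of your installed version, since it is not covered in the online docs.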

This perceptual hash metric is pretty much settled, apart from any potential bug fixes. Other metrics that do work with a Hamming distance might be implemented in the future. See my bash Unix shell script, phashes, at my link below.

The link to the reference paper on the current IM perceptual hash describes the algorithm in more detail.

terraformer
Posts: 3
Joined: 2020-02-15T12:08:20-07:00

Re: perceptualHash() not documented in API

Post by terraformer »

OK, thanks for the info.
