Re: PDF to image causes 1st page margin

Posted: 2013-08-12T13:35:36-07:00
by BigLittle
I've found that the height of the document determines the angle of the watermark, as well as its size. I wonder if it's possible to pattern-match the watermark and then remove it. It looks like that is what you were talking about with morphology. I don't understand it yet, but I'm going to go through http://www.imagemagick.org/Usage/morphology/ and try to figure it out.

Re: PDF to image causes 1st page margin

Posted: 2013-08-12T15:08:22-07:00
by BigLittle
Removing the majority of it actually seems reasonable. Looking at compare -subimage-search and possibly morphology, it seems doable. I've toyed around with it, but I can't get it to work.

I'd pay you (or anyone else) who can figure out how to remove most of it so that OCR works well. I attached the pattern and several documents. The 11-0.png document has an exact match, while the others might be slightly different, which is the biggest challenge.

Re: PDF to image causes 1st page margin

Posted: 2013-08-12T15:53:45-07:00
by fmw42
You might be able to match the watermark-only image to the image with text and watermark by using compare -subimage-search. Then you need to use -compose subtract or -compose divide to remove the watermark. Morphology open or close will only try to remove small dots. That did not work well for me when I tested it.

Re: PDF to image causes 1st page margin

Posted: 2013-08-12T20:52:15-07:00
by BigLittle
I tried Imagick compare, but got an error:


$Page = new Imagick('Result-0.png');
$Page2 = new Imagick('SearchPatternPNG.png');
$Result = $Page2 -> compareImages($Page, Imagick::COMPOSITE_SATURATE);
$Result[0] -> setImageFormat('jpeg');
echo $Result[0];


Fatal error: Uncaught exception 'ImagickException' with message 'Compare images failed' in /home/pitmanco/public_html/la/ndrin/search.php:9 Stack trace: #0 /home/pitmanco/public_html/la/ndrin/search.php(9): Imagick->compareimages(Object(Imagick), 44) #1 {main} thrown in /home/pitmanco/public_html/la/ndrin/search.php on line 9

I printed both images and they display fine, but the comparison fails, and the error isn't very helpful. I also tried the command line, and got no response and no files created when I ran:
compare -subimage-search /fullpath/Result-0.png /fullpath/SearchPatternPNG.png /fullpath/ZZZ.png

Edit: I then ran "compare -subimage-search /path/Result-0.png /path/SearchPatternPNG.png /path/ZR-%d.png", which did execute and used a large amount of server resources, but returned no image.

Re: PDF to image causes 1st page margin

Posted: 2013-08-12T21:46:41-07:00
by fmw42
Try setting -metric rmse (or some other metric). Also note that for -subimage-search the two images must be different sizes, with the larger one first:

compare -metric rmse -subimage-search largeimage smallimage resultimages

If you are running it via PHP exec(), you will likely need to redirect stderr to stdout, because compare writes the metric result to stderr:

compare -metric rmse -subimage-search largeimage smallimage resultimages 2>&1
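
To make the redirect concrete, here is a minimal sketch of capturing that output and pulling the match location out of it. It assumes compare reports the result as "score (normalized score) @ x,y", which is worth confirming on your own version. The function names (find_watermark, parse_match_offset, parse_match_score) are made up for illustration, and null: is used to discard the result images.

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.

# Pull "x,y" out of a result line such as "1868.2 (0.0285069) @ 44,17".
# ASSUMPTION: compare prints "score (normalized) @ x,y"; check your version.
parse_match_offset() {
    printf '%s\n' "$1" | sed -n 's/.*@ *\([0-9][0-9]*,[0-9][0-9]*\).*/\1/p'
}

# Pull the normalized score (the number inside the parentheses).
parse_match_score() {
    printf '%s\n' "$1" | sed -n 's/.*(\([^)]*\)).*/\1/p'
}

# Run the search: $1 = larger image, $2 = smaller image.
# 2>&1 folds stderr into stdout so the caller can capture the metric line.
find_watermark() {
    command -v compare >/dev/null 2>&1 || {
        echo "compare not found" >&2
        return 127
    }
    compare -metric rmse -subimage-search "$1" "$2" null: 2>&1
}
```

Something like result=$(find_watermark largeimage smallimage) followed by parse_match_offset "$result" would then give the offset to use when compositing the watermark back out.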

see
http://www.imagemagick.org/Usage/compare/
http://www.imagemagick.org/Usage/compare/#statistics
http://www.imagemagick.org/script/compare.php

I do not know much about doing compare in Imagick, but it does work from the command line.

See the following old example, but note that it now needs the addition of -subimage-search:
viewtopic.php?f=1&t=14613&p=51076&hilit ... ric#p51076

Re: PDF to image causes 1st page margin

Posted: 2013-08-13T10:22:18-07:00
by BigLittle
I tried it with an example photo, which worked. For some reason, the search pattern and search image I attached (SearchImage.jpg/SearchPattern.png) run until the server kills the process. I also tried a small version (attached as SearchImageZ/SearchPatternZ), which returned the error: images too dissimilar `/SearchImageZ.jpg' @ error/compare.c/CompareImageCommand/953.

I'll be trying different patterns to see if something works. Any thoughts on what I'm doing wrong?

Re: PDF to image causes 1st page margin

Posted: 2013-08-13T10:48:47-07:00
by fmw42
IM compare is set up for normal type images and will stop if the images are too dissimilar. So add -dissimilarity-threshold 100% to the command. That should keep it from stopping too quickly. If you want to speed it up, you can also add -similarity-threshold somesmallvalue, if you use -metric rmse. It will then stop when it reaches a match that has a metric value smaller than or equal to your somesmallvalue. If you know you have a perfect match, you can use somesmallvalue=0 (in the quantum range, which goes up to 65535 for a Q16 compile or 255 for a Q8 compile) or 0% (in the range 0 to 100). So that value can be absolute or a percent. If you do not believe the match will be perfect, then raise the value to something bigger than 0 but still small, or it will stop at a close but not optimum match. Otherwise, just leave off -similarity-threshold and wait for it to finish.

see
http://www.imagemagick.org/script/comma ... -threshold
http://www.imagemagick.org/script/comma ... -threshold
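
Putting the two thresholds together, a rough sketch might look like the following. It follows the options exactly as described in this post and has not been checked against every IM version. The function names (percent_to_quantum, search_with_thresholds) are invented for illustration, and the conversion assumes the quantum range is 2^depth - 1 (65535 for Q16).

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.

# Convert a percentage to an absolute quantum value.
# $1 = percent (0 to 100), $2 = quantum depth (8 or 16).
# ASSUMPTION: the quantum range is 2^depth - 1.
percent_to_quantum() {
    awk -v p="$1" -v d="$2" 'BEGIN { printf "%.2f\n", (2^d - 1) * p / 100 }'
}

# $1 = larger image, $2 = smaller image, $3 = similarity threshold (e.g. "1%").
# -dissimilarity-threshold 100% keeps compare from giving up early;
# -similarity-threshold lets it stop at the first match that is good enough.
search_with_thresholds() {
    command -v compare >/dev/null 2>&1 || {
        echo "compare not found" >&2
        return 127
    }
    compare -metric rmse -subimage-search \
        -dissimilarity-threshold 100% \
        -similarity-threshold "$3" \
        "$1" "$2" null: 2>&1
}
```

For example, search_with_thresholds SearchImage.jpg SearchPattern.png 1% would stop at the first location whose rmse is within 1%, which on a Q16 build corresponds to the absolute value printed by percent_to_quantum 1 16.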