[magick-users] How to locate (x, y) coordinates of one sub-image within another?

Peter Valdemar Mørch swp5jhu02 at sneakemail.com
Thu Jul 6 01:34:35 PDT 2006


Gabe Schaffer magick-at-gabe.com |Lists| wrote:
> * Let's say you're looking for the string "ABC". You would build a
> table that would have 0 for A, -1 for B, -2 for C, and 3 for any other
> letter. When searching a string, you would look up the current letter
> in your table and skip ahead however many letters it says. That way
> you would only look at every 3rd letter, backtracking a bit if you
> find a letter that's a B or C, and searching letter-by-letter only if
> you hit an A.
> 
> For an image, basically you would build a table that says for any
> color in the image you're searching for, what the minimum X and Y
> distances are to the upper-left. I don't think I have time to write
> such a thing, but it shouldn't be too hard.
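
To make sure I understand the string version of the table, here is a rough Python sketch of it. This is my own reading, not Gabe's code, and the names build_offsets and find_all are made up for illustration. Instead of one offset per letter it stores every index at which a letter occurs in the pattern, so that patterns with repeated letters also work. It then looks only at every len(pattern)-th letter of the text and verifies a candidate start whenever that letter occurs in the pattern.

```python
from collections import defaultdict


def build_offsets(pattern):
    """Map each letter to every index at which it occurs in the pattern."""
    offsets = defaultdict(list)
    for i, ch in enumerate(pattern):
        offsets[ch].append(i)
    return offsets


def find_all(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    offsets = build_offsets(pattern)
    hits = []
    # Any occurrence of length m covers exactly one sampled index
    # (m-1, 2m-1, 3m-1, ...), so sampling every m-th letter misses nothing.
    for pos in range(m - 1, len(text), m):
        # Letters that are not in the pattern yield no candidates at all.
        for i in offsets.get(text[pos], ()):
            start = pos - i  # "backtrack" to where the pattern would begin
            if text.startswith(pattern, start):
                hits.append(start)
    return sorted(hits)
```

For "ABC" the stored indices are 0, 1 and 2 for A, B and C, which correspond to the 0, -1 and -2 in the table above, and "3 for any other letter" is simply the step of the outer loop.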

Now the thing is that I'll probably be looking a lot for things like the 
title of a window, the text on a button etc. So the colors in the 
sub-image are likely to be the most common colors on the desktop.

But I think your idea has substantial merit. I could start with the 
first screenshot and investigate the frequency of colors in it. Then 
find the pixels in the sub-image that are least likely to occur in the 
screenshot and look for those first, so I don't waste time looking for 
the most common colors first. And then look for a pixel of another color 
at some offset so I don't waste time looking for e.g. 29 blue pixels 
next to each other, a rather common occurrence. Ideally I should look for 
color transitions first, to be able to rule out false matches quickly.
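
A minimal sketch of what I have in mind, in Python rather than perl just to keep it short. Everything here is an assumption for illustration: images are plain lists of rows, each row a list of color values, and the names pick_probes and find_subimage are invented. Nothing in it comes from Image::Magick.

```python
from collections import Counter


def pick_probes(screen, sub):
    """Pick up to two sub-image pixels to test first.

    The first probe is the pixel whose color is rarest in the screenshot;
    the second is the rarest pixel of a different color.
    """
    freq = Counter(color for row in screen for color in row)
    coords = sorted(
        ((x, y) for y, row in enumerate(sub) for x in range(len(row))),
        key=lambda p: freq[sub[p[1]][p[0]]],
    )
    first = coords[0]
    probes = [first]
    for x, y in coords[1:]:
        if sub[y][x] != sub[first[1]][first[0]]:
            probes.append((x, y))
            break
    return probes


def find_subimage(screen, sub):
    """Return (x, y) of every exact occurrence of sub inside screen."""
    sh, sw = len(screen), len(screen[0])
    h, w = len(sub), len(sub[0])
    probes = pick_probes(screen, sub)
    matches = []
    for top in range(sh - h + 1):
        for left in range(sw - w + 1):
            # Cheap rejection: the probe pixels must match first.
            if any(screen[top + y][left + x] != sub[y][x] for x, y in probes):
                continue
            # Full comparison only for the few surviving candidates.
            if all(screen[top + dy][left:left + w] == sub[dy] for dy in range(h)):
                matches.append((left, top))
    return matches
```

This only handles exact pixel matches and still visits every candidate position. The gain is that almost all positions are rejected after one or two pixel comparisons.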

The beginnings of an algorithm are appearing slowly on the horizon. I'm 
amazed, though, that this isn't something that is in the standard lib. 
Especially since lots of image experience and optimization seems 
appropriate given the above discussion.

This is likely to be *much* faster in C, so maybe it should be 
implemented in C and then the appropriate perl bindings created... 
Does anybody have experience with the speed difference between C and 
perl for this sort of thing?

>> Background: I want to write an app to automate mouse operations in a
>> Windows GUI.
>>
>> I plan to use the perl module Win32::Screenshot that returns an
>> Image::Magick perl object...
> 
> If the program you are automating uses standard Windows controls, each
> button will be a window you can find and send a message to. The only
> time you would need something like searching screenshots is if you are
> using a program which draws its own buttons (like a Flash app).

The problem is, though, that the app we'd like to remote-control is 
running under Citrix. And with Citrix, the Windows controls aren't given 
meaningful names, so sending a message to a "Citrix" window control 
isn't that straightforward.

And then we have Flash, Java, HTML...

Hence the image-analysis approach.

Peter

-- 
Peter Valdemar Mørch
http://www.morch.com

