[magick-users] extracting text area from image
Alexandru Ciobanu
capsunel at gmail.com
Mon Jun 11 06:54:19 PDT 2007
Hi, Ron!
I need (1), i.e., to extract an image which is the same size as the text area.
I will use a dedicated tool for OCR (2), which is the next step.
Basically this must be a primitive implementation of layout analysis. =)
And the image is here (I thought it had gone through as an attachment):
http://picasaweb.google.com/capsunel/Imaging/photo#5074517723571163346
Note: the red area is not really important.
Alex
PS: I've posted the same question here:
http://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=8949
On 6/10/07, Ron Savage <ron at savage.net.au> wrote:
>
> Alexandru Ciobanu wrote:
>
> Hi Alexandru
>
> > I am trying to use ImageMagick to extract strictly
> > the text area from a photograph of a book page.
>
> Do you mean
>
> (1) extract an image which is the same size as the text area, or
> (2) extract the text letter-by-letter
>
> The latter is called Optical Character Recognition, and I do not know of
> any such feature within IM.
>
> > If you look at the image attached, I am interested in the
> > green area and, if possible, the red area.
>
> No image attached. Please upload to your web site.
>
> > The problem is that it has to be automated and work
> > for books of various sizes.
>
> Sure.
>
> > My idea so far is:
> > apply a really crazy filter that would transform the
> > green area into a big uniform blob, so that I can
> > then extract its coordinates, and then use those
> > on the original image.
>
> Sounds reasonable, but also sounds like (1) above.
> --
> Ron Savage
> ron at savage.net.au
> http://savage.net.au/
> _______________________________________________
> Magick-users mailing list
> Magick-users at imagemagick.org
> http://studio.imagemagick.org/mailman/listinfo/magick-users
>
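A minimal sketch of the blob idea quoted above, written in plain Python rather than as ImageMagick commands. It assumes the page is already available as rows of grayscale values (0 = black, 255 = white). The function name `text_bbox` and the parameters `dark_threshold` and `smear` are hypothetical names chosen for illustration. A simple square dilation stands in for the "really crazy filter", and the largest connected blob is taken to be the text area. That is an assumption and would likely need tuning for real photographs.

```python
from collections import deque


def text_bbox(gray, dark_threshold=128, smear=2):
    """Return (left, top, right, bottom) of the ink inside the largest
    smeared blob, or None if the page contains no dark pixels.

    gray           -- list of rows, each a list of ints in 0..255
    dark_threshold -- pixels below this value count as ink (assumed value)
    smear          -- dilation radius in pixels; should exceed half the
                      gap between letters and lines (assumed value)
    """
    h = len(gray)
    w = len(gray[0]) if h else 0

    # 1. Binarize: True where the pixel is dark enough to be ink.
    ink = [[gray[y][x] < dark_threshold for x in range(w)] for y in range(h)]

    # 2. Smear: grow every ink pixel into a square so that neighbouring
    #    letters and lines merge into one uniform blob.
    blob = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if ink[y][x]:
                for yy in range(max(0, y - smear), min(h, y + smear + 1)):
                    for xx in range(max(0, x - smear), min(w, x + smear + 1)):
                        blob[yy][xx] = True

    # 3. Flood-fill each blob (4-connectivity) and keep the largest one.
    #    The bounding box is measured on the original ink pixels only,
    #    so the growth added by the smear does not inflate it.
    seen = [[False] * w for _ in range(h)]
    best_area, best_box = 0, None
    for y in range(h):
        for x in range(w):
            if not blob[y][x] or seen[y][x]:
                continue
            queue = deque([(x, y)])
            seen[y][x] = True
            area = 0
            left, top, right, bottom = w, h, -1, -1
            while queue:
                cx, cy = queue.popleft()
                area += 1
                if ink[cy][cx]:
                    left, right = min(left, cx), max(right, cx)
                    top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and blob[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area > best_area:
                best_area, best_box = area, (left, top, right, bottom)
    return best_box
```

The returned coordinates are inclusive pixel positions in the original image, so they can be used directly as a crop rectangle on the unfiltered page. Books of different sizes would mainly change the appropriate `smear` value, which could be chosen relative to the image width instead of being fixed.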