Creating an image of first page of PDF - jagged text.

Posted: 2018-11-21T11:50:35-07:00
by atani
Hi all,

I've been using Image::Magick (v6.6.0) to create a thumbnail JPG image of the first page of a PDF, with good results, for years. Now I'd like to upgrade the base version of ImageMagick (v6.77), but after doing so the resulting image has very jagged text. The same happens with a plain invocation of the "convert" tool: the text comes out jagged.

I've found some solutions online that use the "Super Sampling" approach and that approach works fine when applied to the "convert" command line tool but I can't find the equivalent perl-magick operations.

Previously the code was simple - Read page1 of the PDF and write it as a JPG (simplified for brevity):

Code:

my $magick = Image::Magick->new();
$magick->Read("pdf:${thefile}[0]");
$magick->Write("jpg:$outfile");

Which is like the simple command line:

Code:

convert 'myfile.pdf[0]' outfile.jpg

And I can use the Super Sampling approach with v6.77 to get good results like this on the command line:

Code:

convert 'myfile.pdf[0]' -alpha remove -units PixelsPerInch -support 1.1 -resample 72 outfile.jpg

What I can't seem to figure out is how to remove the alpha channel via Perl::Magick, which seems to be a critical piece of the above CLI invocation. Without it I end up with an almost entirely black image after the resample (which I also need to do, to cut down on the image size).

I've tried various things like setting "alpha" to "Off" and/or "matte" to "False" and "alpha" to "Remove" (even though this does appear as an option in the perl-magick docs) but for all of them, when I resample the density of the image to 72 I end up with an all black image.

Example:

Code:

$pdf_magick->Set(units => 'PixelsPerInch');
$pdf_magick->Set(density => 300);

my $rc = $pdf_magick->Read("pdf:${filename_to_read}[0]");

# $rc holds a status message if the read failed or warned
my $err = defined($rc) ? $rc : $@;
if ($err) {
    say STDERR "error reading PDF: $err";
}

# This is the step that leaves me with an all-black image
$pdf_magick->Set(alpha => 'Off');
$pdf_magick->Resample(density => 72, support => 1.1);
$pdf_magick->Write("jpg:$outfile_name");

How can I remove the alpha channel via perl-magick? Or does anyone have any other pointers for grabbing the first page as a JPEG while preserving smooth text?

Any help is much appreciated!
-Joel

Re: Creating an image of first page of PDF - jagged text.

Posted: 2018-11-21T13:17:07-07:00
by snibgo
PDF documents are often opaque black text on transparent black background. For those documents, turning alpha off gives you an opaque black image.
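
The arithmetic behind this can be sketched in plain Perl. This is only an illustration: it assumes non-premultiplied RGBA values in the range 0..1, and the helper names alpha_off and flatten_over are made up here, not ImageMagick calls.

```perl
use strict;
use warnings;

# A pixel is [r, g, b, alpha], every value in the range 0..1.

# Illustrative model of "alpha off": keep the stored colour, force full opacity.
sub alpha_off {
    my ($px) = @_;
    return [ @{$px}[0 .. 2], 1 ];
}

# Illustrative model of flattening: composite the pixel over an opaque background.
sub flatten_over {
    my ($px, $bg) = @_;
    my $alpha = $px->[3];
    return [ (map { $px->[$_] * $alpha + $bg->[$_] * (1 - $alpha) } 0 .. 2), 1 ];
}

my $glyph = [0, 0, 0, 1];   # opaque black text
my $page  = [0, 0, 0, 0];   # transparent black background
my $white = [1, 1, 1];

for my $case (['glyph', $glyph], ['page', $page]) {
    my ($name, $px) = @$case;
    printf "%-5s alpha off: %s   flattened on white: %s\n",
        $name,
        join(',', @{ alpha_off($px) }),
        join(',', @{ flatten_over($px, $white) });
}
```

With alpha simply switched off, the glyph and the page both end up as opaque black, so the text disappears into the background. Composited over white, the page pixel becomes white and the glyph stays black.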

A better method is to set "-background White", then "-layers flatten". I don't know how to do that in Perl.
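
A possible PerlMagick equivalent, offered as a sketch only: it assumes that Set(background => ...) followed by Flatten() behaves like "-background White -layers flatten" in the installed version, and the sub names flatten_on_white and pdf_first_page_to_jpeg are hypothetical.

```perl
use strict;
use warnings;
use Image::Magick;

# Composite the image over an opaque white canvas.
# Flatten() returns a new image object on success.
sub flatten_on_white {
    my ($image) = @_;
    $image->Set(background => 'white');
    my $flat = $image->Flatten();
    die "Flatten failed: $flat" unless ref $flat;
    return $flat;
}

# Hypothetical wrapper: render page 1 at 300 dpi, flatten it on white,
# then resample down to 72 dpi and write a JPEG.
sub pdf_first_page_to_jpeg {
    my ($pdf_path, $jpg_path) = @_;

    my $image = Image::Magick->new();
    # Set the density before Read so the PDF is rasterised at 300 dpi.
    $image->Set(units => 'PixelsPerInch', density => 300);

    # Any status message, including a warning, is treated as fatal here.
    my $rc = $image->Read("pdf:${pdf_path}[0]");
    die "error reading PDF: $rc" if "$rc";

    my $flat = flatten_on_white($image);

    $rc = $flat->Resample(density => 72, support => 1.1);
    die "Resample failed: $rc" if "$rc";

    $rc = $flat->Write("jpg:$jpg_path");
    die "error writing JPEG: $rc" if "$rc";
    return;
}
```

Called as pdf_first_page_to_jpeg('myfile.pdf', 'outfile.jpg'), this should correspond roughly to the supersampling command line, with the flatten step standing in for "-alpha remove".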

Re: Creating an image of first page of PDF - jagged text.

Posted: 2018-11-28T10:30:21-07:00
by atani
Thanks for that recommendation. Can the "-background White" operation change a PDF's existing background from some other color to white? That is, can a PDF have a background color other than transparent or white? (The file I'm currently using apparently has a white background already.)

Re: Creating an image of first page of PDF - jagged text.

Posted: 2018-11-28T11:18:30-07:00
by snibgo
IM is a processor for raster images. Once you have a raster image, it can change white pixels to any colour you want. But that's not what you want to do.

Ghostscript might have a command-line option to do that.