Will pay a $$$ bounty to fix threading issue

RNHurt
Posts: 6
Joined: 2011-08-16T12:09:00-07:00
Authentication code: 8675308

Will pay a $$$ bounty to fix threading issue

Post by RNHurt »

My employer is trying to use RMagick in a Rails application and we are getting bitten by the OpenMP / ImageMagick threading issue. It's the same old story that's been told in these pages before: processing one image is fast, processing 100 images is slow, and the "solution" is to turn off OpenMP. Well, we think there is a better way and we're willing to pay for it!

Here's the situation. We have a fairly standard Rails 3.1 app that is focused on image manipulation, and we're using ImageMagick 6.6.2 and RMagick 2.13.1 to get the job done. On Linux (CentOS 5.4 / Ubuntu 11.04) and OS X Lion, with OpenMP compiled in, we can render a single image in 0.5ms. Trying to process 10-100 images simultaneously loads the CPUs completely and IM performance plummets. I want to be able to keep threading enabled and scale the number of simultaneous image processes linearly with the number of CPU cores available.

Here are the rules:
  • the code must be integrated back into the ImageMagick code base. This ensures that the code is proper and well constructed and that it will be maintained and used by others.
  • the code must be generic and multi-platform. You can't "cheat" and only speed up certain processes when run on 8-way Amazon AWS machines running Linux 3.1 (for example).
  • the performance jump must be significant. I don't have exact numbers but I would expect to at least double the performance of IM over the current implementation on multiple cores. A 1-2% performance increase is not going to cut it.
How much $$$ are we talking about? I'm thinking about several hundred dollars. Obviously, if this is a 6-month rebuild of the IM core then we probably can't do it. However, if it's a couple of nights with the compiler then you could walk away with some serious extra $$$ to use on that new Das Keyboard you've been eyeing. :)

If this looks like something you would be interested in, respond to this message and let me know what you think.

Thanx!
Richard
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Will pay a $$$ bounty to fix threading issue

Post by magick »

The ImageMagick developers periodically look for ways to increase the performance of the MagickCore API, but we're reaching diminishing returns. OpenMP is great if you're on a quiet system with multiple cores. Assume a 4-core system. OpenMP will process 4 scanlines of the image in parallel and then move on to the next batch. Much of the speed comes from the memory being nearby and readily available to the cache. Once you add additional processes, you have contention for the CPU, and the cache must be cleared to process each task in more or less round-robin fashion. Distributed computing could be an option, but the overhead of farming out the task to a remote computer offsets any gains. A GPU could speed up certain algorithms (e.g. image convolution), but the overhead of the JIT compiler (e.g. OpenCL) and of transferring memory from the host to the GPU again offsets any possible gains. And then again there would be contention for the GPU (one task at a time). The most promising speed-up might come from an FPGA, but we have not researched that solution yet.

If we're missing something in terms of a possible performance gain for ImageMagick, we would like to hear about it. We'll add to the $$$ offering here as an incentive.
RNHurt
Posts: 6
Joined: 2011-08-16T12:09:00-07:00
Authentication code: 8675308

Re: Will pay a $$$ bounty to fix threading issue

Post by RNHurt »

Thank you very much for the information on how IM threads and processes images. It really helps me understand what's going on and why this is such a difficult problem to solve. After talking with my people, I think our best road forward is to build a pool of simple Rails + IM boxes and load balance requests between them. This way we can process individual images very quickly and still have the throughput numbers that we need.
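For illustration only, the "pool of boxes" idea can be sketched as a round-robin dispatcher. This is a minimal sketch, not anything from our actual setup: the host names and the `RoundRobinPool` and `assign` helpers are hypothetical, and a real deployment would more likely put an off-the-shelf load balancer in front of the boxes.

```python
import itertools


class RoundRobinPool:
    """Hand out render hosts in a fixed rotation (hypothetical helper)."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one host is required")
        self._cycle = itertools.cycle(hosts)

    def next_host(self):
        # Each call returns the next host, wrapping around at the end.
        return next(self._cycle)


def assign(requests, pool):
    """Pair each incoming request with the host that should render it."""
    return [(request, pool.next_host()) for request in requests]
```

Each box then only ever sees a few images at a time, so OpenMP can stay enabled on it.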

Given this information, I am willing to extend this bounty for 6 months from today's date. If anyone can solve this problem in a better way within the time limit, then we will gladly honor the bounty and pay out the reward. Good luck! :)

Thank you,
Richard
NicolasRobidoux
Posts: 1944
Joined: 2010-08-28T11:16:00-07:00
Authentication code: 8675308
Location: Montreal, Canada

Re: Will pay a $$$ bounty to fix threading issue

Post by NicolasRobidoux »

Why it is often better to turn off OpenMP when there are more tasks than cores:

The logic behind this suggestion is that multicore computing adds overhead. In theory, by turning off OpenMP, each task will live on one single core, "simplifying" each task. If, however, you have a lot of tasks, your multiple cores will be busy even if you don't allow processes to be distributed over multiple cores. Consequently, there is no need to distribute the load to keep all the cores busy since they are busy already. And, really, this is the one benefit of OpenMP.

In addition, each task will most likely be at a different stage of completion, which means that it will need different resources (load from disk, compute, write to disk...) so that your tasks are more likely to be executed without clobbering each other with requests for the same resources.

Analogy: If you have one task and a lot of workers, it makes sense to assign them all to the one task. They'll get in each other's way, but if they are managed sufficiently well they'll still complete the one task faster than if only one worker was assigned to it.

If you have a lot of tasks, however, you are better off having one task per worker, letting each of them take on a new task without having to consult the other workers.
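A minimal sketch of "one task per worker", assuming each image is handled by a separate `convert` process whose threading is capped through the `MAGICK_THREAD_LIMIT` and `OMP_NUM_THREADS` environment variables. The `-resize 50%` operation, the helper names, and the injectable `run` parameter are illustrative assumptions, not part of ImageMagick or RMagick.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def single_thread_env(base=None):
    """Copy the environment and cap ImageMagick/OpenMP at one thread."""
    env = dict(os.environ if base is None else base)
    env["MAGICK_THREAD_LIMIT"] = "1"
    env["OMP_NUM_THREADS"] = "1"
    return env


def build_command(src, dst):
    # Hypothetical operation; substitute whatever the application needs.
    return ["convert", src, "-resize", "50%", dst]


def process_images(jobs, workers=None, run=subprocess.run):
    """Run one single-threaded convert per job, at most `workers` at a time."""
    workers = workers or os.cpu_count() or 1
    env = single_thread_env()

    def one(job):
        src, dst = job
        run(build_command(src, dst), env=env, check=True)
        return dst

    # Each pool thread just waits on its child process, so the heavy
    # lifting happens in `workers` independent single-threaded converts.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one, jobs))
```

With `workers` left at its default, the number of concurrent converts matches the core count, so the cores stay busy without any task being split across them. On the command line, `-limit thread 1` should have the same effect per invocation.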

-----

Whether this works depends on the memory footprint of the individual tasks. If the memory footprint is sufficiently small, the above holds. If, however, the combined memory footprint is so large that only one task can be fully loaded at one time, you actually are better off having all your cores work on one task.

Analogy: If each task requires a big machine and there is only one such available at any time, it is better to have all workers work on one single task even if there are lots of tasks. However, if each task only requires a hammer and nails, one task per worker is probably best.
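The trade-off can be written down as a rough rule of thumb. This is only a sketch: the `plan_concurrency` function is hypothetical, the per-task footprint has to be measured by whoever uses it, and the in-between case (a few tasks fit, so the cores are split among them) is my extrapolation from the two extremes described above.

```python
def plan_concurrency(cores, task_bytes, available_bytes):
    """Return (concurrent_tasks, threads_per_task) for a given memory budget."""
    if cores < 1 or task_bytes <= 0:
        raise ValueError("cores must be >= 1 and task_bytes must be > 0")
    # How many tasks fit in memory at once (always allow at least one).
    fit = max(1, available_bytes // task_bytes)
    # Never run more tasks than cores; leftover cores go to threads per task.
    tasks = min(cores, fit)
    threads = max(1, cores // tasks)
    return tasks, threads
```

For example, on 8 cores with 16 GiB free, 100 MiB tasks give 8 tasks with 1 thread each (hammer and nails), while 12 GiB tasks give 1 task with 8 threads (the big machine).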

-----

If you give this a try, please let us know how it turns out.
Last edited by NicolasRobidoux on 2011-09-12T19:05:34-07:00, edited 2 times in total.
NicolasRobidoux
Posts: 1944
Joined: 2010-08-28T11:16:00-07:00
Authentication code: 8675308
Location: Montreal, Canada

Re: Will pay a $$$ bounty to fix threading issue

Post by NicolasRobidoux »

And if I may suggest a way to speed things up: Use SSE/Altivec/...

Orc http://code.entropywave.com/documentati ... cepts.html is apparently a reasonable way to access the vector units without the trappings of multiple instruction sets.
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Authentication code: 8675308
Location: Brisbane, Australia

Re: Will pay a $$$ bounty to fix threading issue

Post by anthony »

magick wrote: Distributed computing could be an option but the overhead of farming out the task to a remote computer offsets any gains.
With IMv7 and a re-write of its Command Line Interface to read and process options from pipelines, distributed computing may be a much more feasible proposition.

That is, you can continuously run the equivalent of a convert command on each remote machine connected by a network pipe, and send the operations to perform to the farm of processes from a controlling script/program. It is a technique called co-processing, and in some ways it is like what FastCGI did in the early days of the web.
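To illustrate the co-processing pattern itself (not the IMv7 interface, which does not exist yet), here is a minimal sketch in which a controller keeps one long-lived worker process open and feeds it commands over a pipe. The worker is a stand-in Python script that merely echoes each command; the `CoProcess` class and the one-line-per-command protocol are assumptions made for the example.

```python
import subprocess
import sys

# Stand-in worker: reads one command per line and answers with one line.
# A real worker would be a long-running image processor instead.
WORKER_SOURCE = r"""
import sys
for line in sys.stdin:
    cmd = line.strip()
    if cmd == "quit":
        break
    print("done: " + cmd, flush=True)
"""


class CoProcess:
    """Keep a worker process alive and talk to it over stdin/stdout."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line buffered, so each command is sent promptly
        )

    def send(self, command):
        # One request line out, one reply line back; no process start-up cost.
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        self.proc.stdin.write("quit\n")
        self.proc.stdin.flush()
        self.proc.stdin.close()
        code = self.proc.wait()
        self.proc.stdout.close()
        return code
```

The gain is that the start-up cost is paid once per worker rather than once per operation; over a network, the pipe would be replaced by a socket.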

My work on this re-development of the CLI interface should start in a couple of weeks. It is my number one goal at this time as it will remove the current need to run multiple convert commands to do image processing.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
RNHurt
Posts: 6
Joined: 2011-08-16T12:09:00-07:00
Authentication code: 8675308

Re: Will pay a $$$ bounty to fix threading issue

Post by RNHurt »

The new IMv7 command line interface sounds like it could make a render farm much more palatable. Are there any IM roadmaps? I know open source software doesn't really have timelines but I'm still curious as to when this might be available.
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Will pay a $$$ bounty to fix threading issue

Post by magick »

See http://www.imagemagick.org/script/porting.php for the ImageMagick version 7 porting guide. Anthony will add a section on the command line changes within a month or so.
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Authentication code: 8675308
Location: Brisbane, Australia

Re: Will pay a $$$ bounty to fix threading issue

Post by anthony »

I am being very cautious with CLI changes at the moment, while I work out the low-level plan of attack.

It currently looks however like I'll be integrating CLI handling more closely with the MagickWand interface, which will probably add more MagickWand functions as well.

However, nothing will appear until I am satisfied I am not going to seriously break something. I do not want to leave IMv7 in a broken state for a long period (this is a major re-write), so I am doing much of the work offline until I have it working.

My notes as to what I want the new interface to do are at
http://www.imagemagick.org/Usage/bugs/I ... ipting.txt
and percent escape handling expansion
http://www.imagemagick.org/Usage/bugs/I ... rcent.html

The latter will save the calling/wrapper script from needing a lot of image information when it will just feed it straight back into IM anyway. (A long-time IMv6 wish.)

It will appear in 'porting' when I have things more concrete.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/