GNU Astronomy Utilities manual



4.3.1 A note on threads

Spinning off threads internally is not always the most efficient way to run an application. Creating a new thread isn’t a cheap operation for the operating system. Threads are most useful when the input data are fixed and you want the same operation to be done on different parts of it, for example one input image to ImageCrop with multiple crops from various parts of it. In this case, the image is loaded into memory once, the crops are divided between the threads internally, and each thread cuts out the parts assigned to it from the same image. On the other hand, if you have multiple images and you want to crop the same region out of all of them, it is much more efficient to set --numthreads=1 (so no threads spin off) and run ImageCrop multiple times simultaneously; see How to run simultaneous operations.

You can check the boost in speed by running a program on one of the data sets twice: once with the maximum number of threads and once (with everything else the same) using only one thread. You will notice that the wall-clock time (reported by most programs when they finish) of the multi-threaded run is longer than that of the single-threaded run divided by the number of physical CPU cores available to your operating system. Asymptotically these two can be equal (most of the time they aren’t). So limiting each program to one thread and running as many of them independently as there are available threads will be more efficient.

Note that the operating system keeps a cache of recently processed data, so the second time you process an identical data set (independent of the number of threads used), you will usually get faster results. To make an unbiased comparison, you have to clear the system’s cache between the two runs. On a GNU/Linux system, this can be done with the following command.

$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
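
The comparison described above can be sketched as a short shell script. This is only an illustration under stated assumptions: run_program and wall_seconds are hypothetical helpers (not part of GNU Astronomy Utilities), the body of run_program is a placeholder that you would replace with your real command line, and nproc (from GNU Coreutils) may report logical rather than physical cores.

```shell
#!/bin/bash
# Hypothetical stand-in for the program being timed: replace the body with
# the real command line. "$1" is the number of threads to request.
run_program() {
    :   # e.g. <program> --numthreads="$1" <input>
}

# Print the wall-clock time, in whole seconds, taken by the given command.
wall_seconds() {
    local start end
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $(( end - start ))
}

# nproc reports the processing units available to this process; these may
# be logical rather than physical cores.
ncores=$(nproc)

t_multi=$(wall_seconds run_program "$ncores")
# For an unbiased comparison, clear the system cache at this point
# (see the command above; it needs root privileges).
t_single=$(wall_seconds run_program 1)

echo "all threads: ${t_multi}s"
echo "one thread, divided by $ncores cores: $(( t_single / ncores ))s"
```

Whole-second resolution is enough for runs that take minutes. For short runs, the time reported by the program itself will be more informative.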

SUMMARY: Should I use multiple threads? Depends:

  • If you only have one data set (an image in most cases!), then yes: the more threads you use (up to the number of threads available to your OS), the faster you will get your results.
  • If you want to run the same operation on multiple data sets, it is best to set the number of threads to 1 and use GNU Parallel (see How to run simultaneous operations).
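
As a rough illustration of the second case, the sketch below runs one single-threaded job per data set, a few at a time. It deliberately uses only shell job control (background jobs and wait) instead of GNU Parallel, so it has no dependencies. GNU Parallel schedules jobs more efficiently and is the better tool in practice. The names process_one and run_all are hypothetical, and the body of process_one is a placeholder for your real command.

```shell
#!/bin/bash
# Hypothetical per-file job: replace the body with the real single-threaded
# call, e.g. <program> --numthreads=1 "$1"
process_one() {
    echo "done: $1"
}

# Run process_one on every remaining argument, at most "$1" jobs at a time.
# Jobs are started in batches; each batch must finish before the next starts.
run_all() {
    local njobs=$1
    shift
    local running=0 f
    for f in "$@"; do
        process_one "$f" &
        running=$(( running + 1 ))
        if [ "$running" -ge "$njobs" ]; then
            wait            # block until the whole batch has finished
            running=0
        fi
    done
    wait                    # wait for any jobs left in the last batch
}

# Example: three inputs, two simultaneous jobs.
run_all 2 a.fits b.fits c.fits
```

Because the jobs run simultaneously, the order of the output lines is not guaranteed.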



GNU Astronomy Utilities manual, November 2015.