More performance gains!

Original Adventures in connected component labeling More performance gains! Editable
version 2 of 2



Connected component labeling is a relatively simple form of computer vision. The goal is to identify contiguous regions of pixels and note their outline, size, number or position.

In the simplest sense, how many blobs are in the image?

There is plenty of CCL software existing, see below for a list. But none of them filled my need for a small, standalone, fast and free library/utility. First and foremost, QuickBlobs was made to meet all those goals.

 


4k test image, with white and black areas marked

So how does QuickBlobs stack up? It is written in C and is extremely light weight, compiling to less than 5KB as a library or 15KB as an executable. The worst case memory use of the algorithm is O(sqrt(n)) and a 10 thousand pixel wide image needs (in theory) 600KB. QuickBlob is naturally fairly quick. My low end tablet can do a 4k by 4k image in just 0.6 seconds. Most of that time is spent decoding the image. Benchmarks suggest that a 640x480 x 30fps video stream with around 100 blobs in each frame could be processed in real time on a single 1GHz ARM core. Finally it is licensed LGPL, so you can use it freely in your applications.

How is it so fast? It uses a single pass algorithm based on sorted quadruply linked lists. Most other algorithms work one pixel at a time, while QuickBlob operates on run-length-encoded line segments. It only holds at most a single RLE-ed row of data in memory at any given moment. This particular implementation allocates the maximum amount of memory possible from the start for best cache spatial locality.

Quick Start

Building the QuickBlob library should need almost nothing. It is very close to not even needing a libc. To use it as a library, you'll need to define a few stub functions for loading images and logging results. Take a look at quickblob.h for a summary.

QuickBlob comes with a small example utility, csv-blobs. This takes an image as input and produces a comma separated list of every blob found. The output includes the size of blob, the color, the centroid and the bounding box. The utility needs the DevIL library for image loading. While csv-blobs has not been tested on windows, it should be no more difficult to install than DevIL. On Linux or OSX, just run make to build the library and utility.

For most common workloads, preprocessing with Imagemagick and the csv-blobs utility should meet your needs. For more demanding uses link the library into your code. If you need help building QuickBlob for your platform or integrating it into your application or even just have a feature request, feel free to email me.

After some profiling, the best-case input (a blank page) runs twice as fast and the worst-case input (100% static) runs 25 times fater. There still remains room for improvement. It is single threaded but parts of the algorithm could be run in parallel. If you only need to track blobs of a single color it could be made several times faster and lighter.

Todo

  • contour tracking
  • concave splitting
  • visibility attributes
  • private structures
  • make libc optional

Other Applications:

  • Dotcount proprietary, windows only, non-commericial
  • RoboRelm proprietary, windows only, $150
  • Insight Tool Kit Apache licensed, very large (550 pages of docs)
  • cvBlobsLib BSD, based on OpenCV
  • cvBlob LGPL, C++ library, based on OpenCV
  • Blobscanner GPLv3, Processing Language
  • Blob Tool Lisp based image processing toolkit
  • Imagemagick is not too bad, see the Benchmarks for more about it.
  • Plugins for Matlab, ImageJ and other similar large math/image suites. Often these plugins do not work with recent versions.

Benchmarks

Of those other programs, Imagemagick is by far the easiest to try out. We'll be using the lorem.png image as a test case. The image is 4000 pixels on a side and has around 500 blobs in it. Tests show the following:

  • csv-blobs -- 0.57 seconds, 20MB ram
  • imagemagick -- 25 seconds, 695B ram

Csv-blobs is 40x faster and uses 35x less memory. If you subtract out the time spent loading the image (0.38 seconds) and the memory used to store the bitmap (16MB) then quickblob runs 130x faster and 170x lighter.

Imagemagick also tends to choke up easily. The lorem.png image is antialiased but those benchmarks convert it to a 2-level bitmap first. What if we force both programs to use the full 256 grayscale? This bumps the number of blobs up by 280x, to more than 140 thousand. Csv-blobs handles this without complaint. Memory use stays the same but runtime takes 4x as long. Imagemagick completely breaks, with a "too many objects" error.