March 2015: csv-blobs improvements: windows binary, more speed, dramatically less memory
Connected component labeling is a relatively simple form of computer vision. The goal is to identify contiguous regions of pixels and note their outline, size, number or position.
In the simplest sense, how many blobs are in the image?
There is plenty of CCL software existing, see below for a list. But none of them filled my need for a small, standalone, fast and free library/utility. First and foremost, QuickBlobs was made to meet all those goals.
So how does QuickBlobs stack up? It is written in C and is extremely light weight, compiling to less than 5KB as a library or 15KB as an executable. The worst case memory use of the algorithm is O(sqrt(n)) and a 10 thousand pixel wide image needs (in theory) 600KB. QuickBlob is naturally fairly quick. My low end tablet 1.6GHz laptop can do a 4k by 4k image in just 0.6 0.4 seconds. Most of that time is spent decoding the image. Benchmarks suggest that a 640x480 x 30fps video stream with around 100 blobs in each frame could be processed in real time on a single 1GHz ARM core. Finally it is licensed LGPL, so you can use it freely in your applications.
How is it so fast? It uses a single pass algorithm based on sorted quadruply linked lists. Most other algorithms work one pixel at a time, while QuickBlob operates on run-length-encoded line segments. It only holds at most a single RLE-ed row of data in memory at any given moment. This particular implementation allocates the maximum amount of memory possible from the start for best cache spatial locality.
Building the QuickBlob library should need almost nothing. It is very close to not even needing a libc. To use it as a library, you'll need to define a few stub functions for loading images and logging results. Take a look at
quickblob.h for a summary.
QuickBlob comes with a small example utility,
csv-blobs. This takes an image as input and produces a comma separated list of every blob found. The output includes the size of blob, the color, the centroid and the bounding box. The utility needs the DevIL library LibPNG library for image loading. While
csv-blobs has not been tested on windows, it should be no more difficult to install than DevIL. For those of you on windows, here is a small pre-compiled and statically linked version: quickblob-mingw32.zip On Linux or OSX, just run
make to build the library and utility.
For most common workloads, preprocessing with Imagemagick and the
csv-blobs utility should meet your needs. For more demanding uses link the library into your code. If you need help building QuickBlob for your platform or integrating it into your application or even just have a feature request, feel free to email me.
After some profiling, the best-case input (a blank page) runs twice as fast and the worst-case input (100% static) runs 25 times fater faster. There still remains room for improvement. It is single threaded but parts of the algorithm could be run in parallel. If you only need to track blobs of a single color it could be made several times faster and lighter.
- contour tracking
- concave splitting
- visibility attributes
- private structures
- make libc optional
- Dotcount proprietary, windows only, non-commericial
- RoboRelm proprietary, windows only, $150
- Insight Tool Kit Apache licensed, very large (550 pages of docs)
- cvBlobsLib BSD, based on OpenCV
- cvBlob LGPL, C++ library, based on OpenCV
- Blobscanner GPLv3, Processing Language
- Blob Tool Lisp based image processing toolkit
- Imagemagick is not too bad, see the Benchmarks for more about it.
- Plugins for Matlab, ImageJ and other similar large math/image suites. Often these plugins do not work with recent versions.
Of those other programs, Imagemagick is by far the easiest to try out. We'll be using the
lorem.png image as a test case. The image is 4000 pixels on a side and has around 500 blobs in it. Tests show the following:
csv-blobs-- 0.57 0.38 seconds, 20MB ram, 2.36MB ram
imagemagick-- 25 seconds, 695B ram, 695MB ram
Imagemagic is 65x slower and uses 290x more memory. But this was on a 64 bit platform. On a 32 bit platform,
csv-blobs uses only 1.1MB to process a 4k image. Imagemagick OOM-ed on that test, so I don't think it sees a similar benefit.
Csv-blobs is 40x faster and uses 35x less memory. If you subtract out the time spent loading the image (0.38 seconds) and the memory used to store the bitmap (16MB) then quickblob runs 130x faster and 170x lighter.
Imagemagick also tends to choke up easily. The
lorem.png image is antialiased but those benchmarks convert it to a 2-level bitmap first. What if we force both programs to use the full 256 level grayscale? This bumps the number of blobs up by 280x, to more than 140 thousand. Csv-blobs handles this without complaint. Memory use stays the same but runtime takes 4x 5x as long. Imagemagick completely breaks, with a "too many objects" error.