csv-blobs improvements: windows binary, more speed, dramatically less memory

Original Adventures in connected component labeling More performance gains! csv-blobs improvements: windows binary, more speed, dramatically less memory Editable
version 3 of 3



 

Connected component labeling is a relatively simple form of computer vision. The goal is to identify contiguous regions of pixels and note their outline, size, number or position.

Add that QuickBlob only does CCL and not feature extraction or mot...


 

In the simplest sense, how many blobs are in the image?

There is plenty of CCL software existing, see below for a list. But none of them filled my need for a small, standalone, fast and free library/utility. First and foremost, QuickBlobs was made to meet all those goals.

 


4k test image, with white and black areas marked

So how does QuickBlobs stack up? It is written in C and is extremely light weight, compiling to less than 5KB as a library or 15KB as an executable. The worst case memory use of the algorithm is O(sqrt(n)) and a 10 thousand pixel wide image needs (in theory) 600KB. QuickBlob is naturally fairly quick. My low end 1.6GHz laptop can do a 4k by 4k image in just 0.4 seconds. Benchmarks suggest that a 640x480 x 30fps video stream with around 100 blobs in each frame could be processed in real time on a single 1GHz ARM core. Finally it is licensed LGPL, so you can use it freely in your applications.

How is it so fast? It uses a single pass algorithm based on sorted quadruply linked lists. Most other algorithms work one pixel at a time, while QuickBlob operates on run-length-encoded line segments. It only holds at most a single RLE-ed row of data in memory at any given moment. This particular implementation allocates the maximum amount of memory possible from the start for best cache spatial locality.

Quick Start

Building the QuickBlob library should need almost nothing. It is very close to not even needing a libc. To use it as a library, you'll need to define a few stub functions for loading images and logging results. Take a look at quickblob.h for a summary.

 

QuickBlob comes with a small example utility, csv-blobs. This takes an image as input and produces a comma separated list of every blob found. The output includes the size of blob, the color, the centroid and the bounding box. The utility needs the LibPNG library for image loading. For those of you on windows, here is a small pre-compiled and statically linked version: quickblob-mingw32.zip On Linux or OSX, just run make to build the library and utility.


 

For most common workloads, preprocessing with Imagemagick and the csv-blobs utility should meet your needs. For more demanding uses link the library into your code. If you need help building QuickBlob for your platform or integrating it into your application or even just have a feature request, feel free to email me.

After some profiling, the best-case input (a blank page) runs twice as fast and the worst-case input (100% static) runs 25 times faster. There still remains room for improvement. It is single threaded but parts of the algorithm could be run in parallel. If you only need to track blobs of a single color it could be made several times faster and lighter.

Todo

  • contour tracking
  • concave splitting
  • visibility attributes
  • private structures
  • make libc optional

 

Other Applications:

  • Dotcount proprietary, windows only, non-commericial
  • RoboRelm proprietary, windows only, $150
  • Insight Tool Kit Apache licensed, very large (550 pages of docs)
  • cvBlobsLib BSD, based on OpenCV
  • cvBlob LGPL, C++ library, based on OpenCV
  • Blobscanner GPLv3, Processing Language
  • Blob Tool Lisp based image processing toolkit
  • Imagemagick is not too bad, see the Benchmarks for more about it.
  • Plugins for Matlab, ImageJ and other similar large math/image suites. Often these plugins do not work with recent versions.

http://www.cs.cmu.edu/~jbruce/cmvision/


 

Benchmarks

 

Of those other programs, Imagemagick is by far the easiest to try out. We'll be using the lorem.png image as a test case. The image is 4000 pixels on a side and has around 500 blobs in it. Tests show the following:

Memory measurements performed with GNU Time.


 

  • csv-blobs -- 0.38 seconds, 2.36MB ram
  • imagemagick -- 25 seconds, 695MB ram

Imagemagic is 65x slower and uses 290x more memory. But this was on a 64 bit platform. On a 32 bit platform, csv-blobs uses only 1.1MB to process a 4k image. Imagemagick OOM-ed on that test, so I don't think it sees a similar benefit.

Imagemagick also tends to choke up easily. The lorem.png image is antialiased but those benchmarks convert it to a 2-level bitmap first. What if we force both programs to use the full 256 level grayscale? This bumps the number of blobs up by 280x, to more than 140 thousand. Csv-blobs handles this without complaint. Memory use stays the same but runtime takes 5x as long. Imagemagick completely breaks, with a "too many objects" error.