Friday, 11 January 2013

Fast Pattern Features

Inventing really fast pattern matching algorithms is good fun. 

I've recently been looking at methods to accelerate one of my pattern matching algorithms and thought I'd share some of my techniques with you.  It takes alot of precomputation, but then alignment is really fast.  The algorithm is intended for registration of a high resolution template to a similar target image, for applications when some alignment is required but not a full image search.

Feature Patches

Performing image correlation is slow, but it can be accelerated by not using every pixel.  The algorithm I'm using splits a high resolution trained template image into multiple small feature patches. Only the best patches that have the highest structure tensor are then used for alignment (Figure 1).  The reduction of the image to its key features reduces the amount of correlation required significantly.

Patches that are close together are pruned, leaving the more widely spaced patches. We don't want all our patches clustered together since that would make determination of global rotation or scaling more sensitive to local distortions.  This patch feature extraction dramatically cuts down the number of pixels actually required to find a good match using any correlation based approach.  Using small patches (in my case between 16x16 and 32x32 pixels) has the advantage that each patch can usually be matched using a straightforward translation-only search.  After all (or most) of the patches are matched, a full affine deformation can still be computed from the individual match locations.

Figure1. High structure patch regions can be located using the structure tensor.  These can be used to reduce the number of pixels required to make a match.

Fast Feature Vectors

Even searching for a small number (e.g. 16) small patches over a reasonable search region takes too much time for most applications.   Fortunately, a significant accleration can be achieved by rapidly rejecting very poor quality matches using a cadidate search routine.  Most of the possible match positions will be very different from the correct match position.  We can rapidly discriminate between a candidate match and a non-match by comparing a few carefully selected pixels in each patch (Figure.2)


Figure 2.  A high structure patch can be further reduced into a feature pixel representation.  Key pixels can be used to rapdily reject non-matches during search.  The problem is determining which pixels are optimal to use.

The interesting part of the algorithm must decide which combination (and how many) pixels inside each patch are required to provide good discrimination between a possible match and a probable non-match.  I decide that a measure of 'goodness' for a set of pixels is found by auto-correlating the set with the patch.  Highly discriminative sets will have the fewest match locations, with a really good set having only one match location.

Clearly, there are an awful lot of potential combinations of pixels that could be used.  A  brute force test of all possible combinations is not really feasible.  We must cut down the search space.  We can guess that discriminating pixels lie on or near an edge that divides areas of different intensity.  The intensity of a pixel direclty on a strong edge is not very robust to sub-pixel translations, so those should probably not be used (unless we pre-blur).  We can guess that descriptive pixels do not lie on plain featureless regions either, since these pixels are not uniquely matchable.  In all, we can assume that the most discriminative pixels do not lie on the high gradient pixels (edges), and not on the lowest gradient pixels (too plain).  So if we restrict possible discriminative to those ones on a local contrast maximum we can reduce the optimisation considerably.

There are still a large number of possible combinations, but now few enough to enumerate and test in a feasible amount of time.  I limit this to the top 32 such pixels.

Every possible combination of four of these candidate pixels is tested as a potential discriminating arrangement.  This is a total of 32^4, or about a million possible combinations to be tested.  Every combination of four pixels is auto-correlated with the entire patch search region and the number of matches counted.  The combination that yields the fewest false matches is chosen as the most discriminating set.  This pixel set can be used at run time to rapidly reject false match locations with only the locations that pass this quick test going forward to full patch correlation.

This process cuts a multi-megapixel image correlation down to a very limited set key patches, with a very limited number of pixels in each patch required to find a match location.  I have used this algorithm successfully for very rapid alignment of printed material without recourse to pyramid formation or more complex feature matching. 


Vision Experts



1 comment: