I've recently been looking at methods to accelerate one of my pattern matching algorithms and thought I'd share some of my techniques with you. It takes a lot of precomputation, but alignment is then really fast. The algorithm is intended for registration of a high resolution template to a similar target image, for applications where some alignment is required but not a full image search.
Feature Patches
Performing image correlation is slow, but it can be accelerated by not using every pixel. The algorithm I'm using splits a high resolution trained template image into multiple small feature patches. Only the best patches, those with the strongest structure tensor response, are then used for alignment (Figure 1). Reducing the image to its key features significantly cuts the amount of correlation required.
Patches that are close together are pruned, leaving the more widely spaced ones. We don't want all our patches clustered together, since that would make the determination of global rotation or scaling more sensitive to local distortions. This patch feature extraction dramatically cuts down the number of pixels actually required to find a good match using any correlation based approach. Using small patches (in my case between 16x16 and 32x32 pixels) has the advantage that each patch can usually be matched using a straightforward translation-only search. After all (or most) of the patches are matched, a full affine deformation can still be computed from the individual match locations.
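The patch selection step can be sketched in NumPy. This is a minimal illustration, not the exact implementation: the scoring function (the smaller eigenvalue of the summed structure tensor), the tile grid, and the spacing threshold are my own illustrative choices.

```python
# Sketch of patch selection via the structure tensor (illustrative
# parameters; not necessarily the values used in the article).
import numpy as np

def structure_score(patch):
    """Smaller eigenvalue of the 2x2 structure tensor summed over the
    patch: high only when the patch has gradients in two directions."""
    gy, gx = np.gradient(patch.astype(float))
    jxx, jyy, jxy = (gx * gx).sum(), (gy * gy).sum(), (gx * gy).sum()
    trace, det = jxx + jyy, jxx * jyy - jxy * jxy
    # min eigenvalue of [[jxx, jxy], [jxy, jyy]]
    return trace / 2 - np.sqrt(max(trace * trace / 4 - det, 0.0))

def select_patches(image, size=16, max_patches=16, min_spacing=48):
    """Score every non-overlapping size x size tile, then greedily keep
    the best-scoring tiles that are at least min_spacing apart (L1)."""
    h, w = image.shape
    tiles = [(structure_score(image[y:y+size, x:x+size]), y, x)
             for y in range(0, h - size + 1, size)
             for x in range(0, w - size + 1, size)]
    tiles.sort(reverse=True)
    kept = []
    for score, y, x in tiles:
        if all(abs(y - ky) + abs(x - kx) >= min_spacing for _, ky, kx in kept):
            kept.append((score, y, x))
        if len(kept) == max_patches:
            break
    return kept

# Toy example: noise plus one strongly structured bright block.
rng = np.random.default_rng(0)
img = rng.normal(0, 1, (128, 128))
img[30:40, 30:40] += 20.0  # strong two-directional structure here
patches = select_patches(img, size=16, max_patches=4)
```

The greedy spacing check is what implements the pruning of clustered patches: a high-scoring tile is skipped if it sits too close to an already accepted one.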
Figure 1. High structure patch regions can be located using the structure tensor. These can be used to reduce the number of pixels required to make a match.
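Once most patches have been matched by translation-only searches, the full affine deformation mentioned above can be recovered from the individual match locations by least squares. A minimal sketch, assuming NumPy (the function name and point format are mine):

```python
# Least-squares affine fit from matched patch centres (illustrative).
import numpy as np

def fit_affine(src_pts, dst_pts):
    """Solve dst ~ A @ [x, y, 1] for the 2x3 affine matrix A, in the
    least-squares sense, from corresponding point lists."""
    src = np.asarray(src_pts, float)
    dst = np.asarray(dst_pts, float)
    M = np.hstack([src, np.ones((len(src), 1))])  # N x 3 design matrix
    X, *_ = np.linalg.lstsq(M, dst, rcond=None)   # 3 x 2 solution
    return X.T                                    # 2 x 3 affine matrix

# Toy example: patch centres rotated 90 degrees and shifted by (5, 2).
src = [(0, 0), (10, 0), (0, 10), (10, 10)]
dst = [(5, 2), (5, 12), (-5, 2), (-5, 12)]
A = fit_affine(src, dst)  # recovers [[0, -1, 5], [1, 0, 2]]
```

With more matched patches than the three needed for an exact affine solution, the least-squares fit also averages out small per-patch localisation errors.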
Fast Feature Vectors
Even searching for a small number (e.g. 16) of small patches over a reasonable search region takes too much time for most applications. Fortunately, a significant acceleration can be achieved by rapidly rejecting very poor quality matches using a candidate search routine. Most of the possible match positions will be very different from the correct match position. We can rapidly discriminate between a candidate match and a non-match by comparing a few carefully selected pixels in each patch (Figure 2).
The interesting part of the algorithm must decide which combination of pixels inside each patch (and how many of them) is required to provide good discrimination between a possible match and a probable non-match. I decided that a measure of 'goodness' for a set of pixels can be found by auto-correlating the set with the patch. Highly discriminative sets will have the fewest match locations, with a really good set having only one match location.
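The 'goodness' measure can be sketched as a match-location count: slide the pixel set over the patch's search region and count the offsets where every sampled pixel agrees with its reference value. This is my illustrative reading of the measure; the function name and the grey-level tolerance are assumptions, not the article's values.

```python
# Sketch of the 'goodness' measure for a candidate pixel set
# (illustrative tolerance; fewer match locations = more discriminative).
import numpy as np

def match_count(region, pixel_set, values, tol=8):
    """Count offsets where every (dy, dx) probe in pixel_set matches its
    reference value to within tol grey levels. A perfect set would
    match at only one location: the true one."""
    h, w = region.shape
    max_dy = max(dy for dy, dx in pixel_set)
    max_dx = max(dx for dy, dx in pixel_set)
    count = 0
    for oy in range(h - max_dy):
        for ox in range(w - max_dx):
            if all(abs(int(region[oy + dy, ox + dx]) - int(v)) <= tol
                   for (dy, dx), v in zip(pixel_set, values)):
                count += 1
    return count

# Toy example: sample reference values at offset (0, 0), then count how
# many offsets in the region also satisfy all four probes.
rng = np.random.default_rng(1)
region = rng.integers(0, 256, (40, 40))
pixel_set = [(0, 0), (3, 7), (9, 2), (12, 12)]
values = [region[dy, dx] for dy, dx in pixel_set]
n = match_count(region, pixel_set, values)  # always >= 1 (the true offset)
```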
Clearly, there are an awful lot of potential combinations of pixels that could be used. A brute force test of all possible combinations is not really feasible, so we must cut down the search space. We can guess that discriminating pixels lie on or near an edge that divides areas of different intensity. The intensity of a pixel directly on a strong edge is not very robust to sub-pixel translations, so those should probably not be used (unless we pre-blur). We can guess that descriptive pixels do not lie in plain featureless regions either, since those pixels are not uniquely matchable. In all, we can assume that the most discriminative pixels lie neither on the highest gradient pixels (edges) nor on the lowest gradient pixels (too plain). So if we restrict the candidate discriminative pixels to those on a local contrast maximum, we can reduce the optimisation considerably. There are still a large number of possible combinations, but now few enough to enumerate and test in a feasible amount of time. I limit this to the top 32 such pixels.
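One way to implement this restriction is sketched below: exclude the highest and lowest gradient-magnitude pixels by percentile, then keep only local maxima of a simple 3x3 contrast measure, capped at the top 32. The percentile bands and the contrast definition are my own assumptions for illustration.

```python
# Sketch of candidate pixel selection: moderate gradient magnitude,
# local contrast maxima, top 32 (illustrative thresholds throughout).
import numpy as np

def candidate_pixels(patch, n_candidates=32, lo_pct=40, hi_pct=90):
    """Return up to n_candidates (y, x) offsets whose gradient magnitude
    is neither too high (edge) nor too low (flat), and which are local
    maxima of 3x3 neighbourhood contrast."""
    p = patch.astype(float)
    gy, gx = np.gradient(p)
    mag = np.hypot(gx, gy)
    lo, hi = np.percentile(mag, [lo_pct, hi_pct])
    h, w = p.shape
    # local contrast map: intensity range of each 3x3 neighbourhood
    contrast = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            nb = p[y-1:y+2, x-1:x+2]
            contrast[y, x] = nb.max() - nb.min()
    scored = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            in_band = lo < mag[y, x] < hi
            is_peak = contrast[y, x] >= contrast[y-1:y+2, x-1:x+2].max()
            if in_band and is_peak:
                scored.append((contrast[y, x], y, x))
    scored.sort(reverse=True)
    return [(y, x) for _, y, x in scored[:n_candidates]]

rng = np.random.default_rng(3)
patch = rng.integers(0, 256, (24, 24))
offs = candidate_pixels(patch)
```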
Every possible combination of four of these candidate pixels is tested as a potential discriminating arrangement. Choosing four pixels from 32 gives C(32,4), or roughly 36,000, possible combinations to be tested. Each combination of four pixels is auto-correlated with the entire patch search region and the number of matches counted. The combination that yields the fewest false matches is chosen as the most discriminating set. This pixel set can be used at run time to rapidly reject false match locations, with only the locations that pass this quick test going forward to full patch correlation.
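The run-time side can be sketched as a two-stage search: a cheap four-pixel probe rejects most offsets, and only survivors get the full patch comparison. I use a sum of absolute differences for the full comparison here purely for brevity; the tolerance and function names are mine.

```python
# Sketch of the two-stage run-time search: four probe pixels first,
# full patch comparison (SAD here, for brevity) only for survivors.
import numpy as np

def fast_search(region, patch, pixel_set, tol=8):
    """Translation-only search. An offset is rejected as soon as any of
    the four probe pixels disagrees by more than tol grey levels."""
    ph, pw = patch.shape
    h, w = region.shape
    probe_vals = [int(patch[dy, dx]) for dy, dx in pixel_set]
    best, best_sad = None, np.inf
    for oy in range(h - ph + 1):
        for ox in range(w - pw + 1):
            # cheap test: four probe pixels
            if any(abs(int(region[oy + dy, ox + dx]) - v) > tol
                   for (dy, dx), v in zip(pixel_set, probe_vals)):
                continue
            # expensive test: full patch sum of absolute differences
            sad = np.abs(region[oy:oy+ph, ox:ox+pw].astype(int)
                         - patch.astype(int)).sum()
            if sad < best_sad:
                best, best_sad = (oy, ox), sad
    return best

# Toy example: plant the patch in a larger noisy region and find it.
rng = np.random.default_rng(2)
region = rng.integers(0, 256, (64, 64))
patch = rng.integers(0, 256, (16, 16))
region[20:36, 12:28] = patch  # true location is (20, 12)
found = fast_search(region, patch, [(0, 0), (5, 11), (10, 3), (15, 15)])
# found == (20, 12)
```

Against random backgrounds, the vast majority of the roughly 2,400 offsets here fail the first probe, so the expensive comparison runs only a handful of times.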
This process cuts a multi-megapixel image correlation down to a very limited set of key patches, with a very limited number of pixels in each patch required to find a match location. I have used this algorithm successfully for very rapid alignment of printed material, without recourse to pyramid formation or more complex feature matching.
Vision Experts