ICP (version 1.0, 2010-August-22) |
ICP.py
Version: 1.0
Author: Avinash Kak (kak@purdue.edu)
Date: 2010-August-22
Download Version 1.0
gztar
bztar
View version 1.0 code in browser
Switch to Version 1.1
Switch to Version 1.2
INTRODUCTION:
ICP stands for Iterative Closest Point algorithm. ICP algorithms
are used to register two data sets (meaning making one data set
spatially congruent with the other data set) by applying iteratively a
rotation and translation to one data set until it is congruent with
the other data set.
In the domain of image processing, ICP can be used to register a data
image recorded through a sensor against a model image that may be produced
by a geographic information system (GIS). A typical application would
be a UAV recording images as it flies over a terrain; a successful
registration between such sensor images and the the model images produced
by an on-board or satellite-connected GIS system would enable precise
computation of the position and the orientation of the UAV vis-a-vis
the terrain.
The main goal of the pure-Python implementation of ICP presented here
is to make it easier to experiment with the different aspects of such
algorithms.
USAGE:
For the case of binary images, a simple usage example would look like:
import ICP
icp = ICP.ICP(
model_image = "triangle1.jpg",
data_image = "triangle2.jpg",
binary_or_color = "binary",
iterations = 40,
display_step = 1,
debug = 1 )
icp.icp()
icp.display_results()
For registering a shape in one image with a shape in another image,
you'd use the binary case shown above. By binary image, we mean an image
in which the only non-zero pixels are on the shape boundaries.
For the case of color images (including the case of grayscale images),
the simplest calls to the module would look like
import ICP
icp = ICP(
model_image = "modelterrain.jpg",
data_image = "image_from_camera.jpg",
binary_or_color = "color",
iterations = 40,
connectivity_threshold = 5,
output_image_size = 100,
display_step = 1,
debug = 1 )
icp.icp()
icp.display_results()
where the parameter output_image_size controls the size of the image
that will actually be used for ICP calculations. Color (and grayscale)
images as output by sensors can be very large and it would be impractical
to process all of the pixel data in the images. This module first
smoothly reduces the size of the images so that the maximum dimension
does not exceed output_image_size and then carries out ICP processing
on the reduced-size images. The parameter connectivity_threshold
controls how many of the edge pixels will be chosen for processing.
In the current implementation, an edge pixel is chosen if it has at
least connectivity_threshold number of other edge pixels in its 5x5
neighborhood.
The module also includes a static method gendata() to illustrate how
one can create simple synthetic images to experiment with this ICP module.
A call to this method looks like
import ICP
ICP.gendata()
As currently programmed, the gendata() method constructs images with
triangels and straight lines segments.
CONSTRUCTOR PARAMETERS:
model_image: The name of the model file (REQUIRED)
data_image: The name of the data file (REQUIRED)
binary_or_color: Set to 'binary' for binary images and to 'color'
for grayscale and color images (REQUIRED)
output_image_size: The size to which a large image will be reduced
for ICP processing (DEFAULTS to 100 for large
images)
iterations: The maximum number of iterations to try (DEFAULTS
to 40)
error_threshold: When the difference in the registration errors
between two successive iterations is less than
the error_threshold, we stop iterating. (DEFAULTS
to 0.0)
display_step: Controls how often the images at the end of successive
iterations will be shown in the final display of the
results. For example, when set to 5, every fifth image
will be included for the final display. (DEFAULTS to
5 when the number of iterations exceed 40 and to 2
otherwise)
connectivity_threshold:
This parameter decides as to which edge pixels in a
grayscale or a color image will participate in the ICP
calculations. For an edge pixel to be accepted, it
must have at least connectivity_threshold number of
other edge pixels in a 5x5 neighborhood. (DEFAULTS to 5)
save_output_images: When set to 1, the module will save the images produced
at the end of each iteration in the current directory.
Such images are named "__resultX.jpg" for integer values
of X that start from 0. (Defaults to 0)
debug: When set to 1, the module prints out additional
information useful for debugging purposes. (DEFAULTS
to 0)
HOW THE RESULTS ARE DISPLAYED:
The results are displayed using Tkinter graphics (meaning, actually, Tk
graphics via the Tkinter interface). For binary images, the first row
shows the model image followed by the data image. Subsequent rows display
the data image being translated and rotated as it comes into congruence
with the model image.
For color (and grayscale) images, the first row displays six images, three
for the model and three for the data. In each set of three images, the
first image is the actual photograph, the second the edges in the photo,
and the third the pixels actually used for ICP calculations. Subsequent
rows display the data image being brought into registration with the model
image through successive iterations. Note that whereas the ICP calculations
are carried out on only the sparse set of pixels shown as the third image
in the model and the data sequences in the first row, the computed
transformations are applied to the edge images shown as the second image
for each.
The display of results requires access to the "times.ttf" font file for
the labels that are needed in the display. The module assumes that
this file can be located through the paths stored in 'sys.path', or
via the paths available through your environment variables, or at the
following path:
/usr/share/fonts/truetype/msttcorefonts/times.ttf
which is typical for a Ubuntu install. If that is not the case, you would
need to edit the 11th line of the code shown for display_results() method
and enter the pathname to where "times.ttf" can be found.
CAVEATS:
Within the limitations caused by the fact that ICP calculations can get
trapped in a local minimum, you can expect to get good results on binary
images. As for color images, what results you get would depend greatly
on how complex the images are, the quality of edge detection, and how
the final selection of pixels for ICP computation is carried out. The
implementation shown in this module uses very rudimentary edge
detection and even more rudimentary final pixel selection. (That is
because the focus of this module is on ICP and not on digital image
processing.) To improve the results on color (and grayscale) images,
you would need to replace the pixel extraction functions with your own
implementations. Another option would be carry out pixel extraction
separately and to then feed the pixels thus selected as binary images
to this module.
When processing color (and grayscale) images, if you are free to
decide as to which image to call model and which to call data, choose
the one that results in fewer pixels for ICP processing as the
data image. You are likely to get better results if you do so.
INSTALLATION:
The ICP class was packaged using Distutils. For installation, execute the
following command-line in the source directory (this is the directory that
contains the setup.py file after you have downloaded and uncompressed the
package):
python setup.py install
You have to have root privileges for this to work. On Linux distributions,
this will install the module file at a location that looks like
/usr/lib/python2.6/site-packages/
If you do not have root access, you have the option of working directly off
the directory in which you downloaded the software by simply placing the
following statements at the top of your scripts that use the ICP class:
import sys
sys.path.append( "pathname_to_ICP_directory" )
To uninstall the module, simply delete the source directory, locate where
ICP was installed with "locate ICP" and delete those files. As mentioned
above, the full pathname to the installed version is likely to look like
/usr/lib/python2.6/site-packages/ICP*
If you want to carry out a non-standard install of the ICP module, look up
the on-line information on Disutils by pointing your browser to
http://docs.python.org/dist/dist.html
THEORETICAL BASIS:
The first ICP algorithm was proposed by Paul Besl and Neil McKay in a now
celebrated paper that appeared in 1992 in IEEE Transactions on PAMI.
Since then various versions of the algorithm have been published by other
folks for either speeding up the performance of the algorithm or for
improving its accuracy.
The algorithm implemented here is as simple as it can be. We model the
relationship between model and data as
R x_d + T = x_m
where x_d denote the data points and x_m the model points, each a two-element
vector. R is a 2x2 rotation matrix and T a 2-element translation vector.
Since two planar figures may not be related by the above transformation
(even when one figure "appears" to be a rotated version of the other figure)
for an arbitrary location of the origin, move the origin to the mean point
of the model by
x_m = x_m - mean( x_m )
x_d = x_d - mean( x_m )
With regard to the calculation of R and T, let's express the data points and
the CORRESPONDING model points as
[x_d1, x_d2, ....., x_dn]
and their CORRESPONDING model points
[x_m1, x_m2, ....., x_mn]
We can now write the following for estimating the rotation matrix:
R . A = B
where
A = [x_d1, x_d2, ....., x_dn]
B = [x_m1 - T, x_m2 - T, ....., x_mn - T]
Both A and B are 2xn matrices. We can now write
R . A . A^t = B . A^t
where A^t is the transpose of A. So, we have the following as a
least mean squares estimate for R:
R = B . A^t . ( A . A^t )^-1
Since such an R may not obey the strict properties that must apply
to a rotation matrix (it must be orthonormal), we condition it by
first subjecting it to a singular value decomposition:
U.S.V^t = svd( R )
and then writing the following for a better estimate for R:
R = U . V^t
We can now estimate T by
T = mean( x_m ) - mean( R x_d )
The above assumes that we are carrying out an one-shot calculation of R
and T. But ICP is iterative. The following applies to an iterative
implementation of the above:
For an iterative assessment of R and T, let's assume that we can decompose
R as
R = \deltaR . R_0
where we have previously calculated R_0 and now we wish to refine the
estimate with the calculation of \deltaR. Plugging the above in the previous
formulation:
\deltaR . R_0 . A . A^t = B . A^t
implying
\deltaR . R_0 = B . A^t . ( A . A^t )^-1
which is to say:
\deltaR = B . A^t . ( A . A^t )^-1 . R_0^t
After you have calculated \deltaR, you can update the rotation matrix by
R = \deltaR . R_0
At the end of each such update of the rotation matrix, the calculation of
the translation vector remains the same as before
T = mean( x_m ) - mean( R . x_d )
We can therefore write down the following steps for ICP computation:
STEP 1:
x_m = x_m - mean(x_m)
x_d = x_d - mean(x_m)
STEP 2:
Initialize R and T:
R = 1 0
0 1
T = 0
0
old_error = inf
STEP 3: error = (1/N) \sum dist( x_m - (R * x_d + T) )
STEP 4: diff_error = abs(old_error - error)
if (diff_error > threshold):
old_error = error
else:
break
STEP 5: for each x_d find its closest x_m by finding that x_m
which minimizes the squared difference between the
two sides of
R . x_d + T = x_m
STEP 6: A = [x_d1, x_d2, ....., x_dn]
B = [x_m1 - T, x_m2 - T, ....., x_mn - T]
Note that A will remain the same for ICP iterations,
but B can change with each iteration depending on the
what corresponding pixels are found for each data pixel.
STEP 7: AATI = A^t inverse(A * A^t)
STEP 8: R_update = B * AATI * R.T
STEP 9: U,S,VT = svd(R_update)
deter = determinant(U * VT)
U[0,1] = U[0,1] * determinant
U[1,1] = U[1,1] * determinant
R_update = U * VT
STEP 10: R = R_update * R
STEP 11: T = mean(x_m) - mean( R * x_d )
STEP 12: Back to Step 3
ACKNOWLEDGMENTS:
The author would like to thank Chad Aeschliman and Kyuseo Han for
their considerable help with the debugging of the code presented here.
ABOUT THE AUTHOR:
Avi Kak is in the last stages of finishing up his multi-year Objects
Trilogy project. See his web page at Purdue for what this project is
all about.
Modules | ||||||
|
Classes | ||||||||
|
Data | ||
__author__ = 'Avinash Kak (kak@purdue.edu)' __copyright__ = '(C) 2010 Avinash Kak. Python Software Foundation.' __date__ = '2010-August-22' __url__ = 'http://RVL4.ecn.purdue.edu/~kak/distICP/ICP-1.0.html' __version__ = '1.0' |
Author | ||
Avinash Kak (kak@purdue.edu) |