<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<chapter id="sift-and-feature-matching">
  <title>SIFT and feature matching</title>
  <para>
    In this tutorial we’ll look at how to compare images to each other. Specifically,
    we’ll use a popular <emphasis role="strong">local feature descriptor</emphasis>
    called <emphasis role="strong">SIFT</emphasis> to extract
    <emphasis>interesting points</emphasis> from images and describe them in a standard
    way. Once we have these local features and their descriptions, we can match local
    features to each other and therefore compare images to each other, or find a visual
    query image within a target image, as we will do in this tutorial.
  </para>
  <para>
    First, let’s load up a couple of images. Here we have a magazine and a scene
    containing the magazine:
  </para>
  <programlisting>MBFImage query = ImageUtilities.readMBF(new URL("http://dl.dropbox.com/u/8705593/query.jpg"));
MBFImage target = ImageUtilities.readMBF(new URL("http://dl.dropbox.com/u/8705593/target.jpg"));</programlisting>
  <para role="centered">
    <inlinemediaobject>
      <imageobject>
        <imagedata fileref="../../figs/query.jpg" format="JPG" align="center" contentwidth="5cm"/>
      </imageobject>
    </inlinemediaobject>
    <inlinemediaobject>
      <imageobject>
        <imagedata fileref="../../figs/target.jpg" format="JPG" align="center" contentwidth="5cm"/>
      </imageobject>
    </inlinemediaobject>
  </para>
  <para>
    The first step is feature extraction. We’ll use the
    <emphasis role="strong">difference-of-Gaussian</emphasis> feature detector, describing
    each detected point with a <emphasis role="strong">SIFT descriptor</emphasis>. The
    features we find are described in a way that makes them invariant to scale changes,
    rotation and position. These are quite powerful features and are used in a variety of
    tasks. The standard implementation of SIFT in OpenIMAJ can be found in the
    <code>DoGSIFTEngine</code> class:
  </para>
  <programlisting>DoGSIFTEngine engine = new DoGSIFTEngine();
LocalFeatureList&lt;Keypoint&gt; queryKeypoints = engine.findFeatures(query.flatten());
LocalFeatureList&lt;Keypoint&gt; targetKeypoints = engine.findFeatures(target.flatten());</programlisting>
  <para>
    Once the engine is constructed, we can use it to extract <code>Keypoint</code> objects
    from our images. The <code>Keypoint</code> class contains a public field called
    <code>ivec</code> which, in the case of the standard SIFT descriptor, is a
    128-dimensional description of the patch of pixels around a detected point. Various
    distance measures can be used to compare one <code>Keypoint</code> with another.
  </para>
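  <para>
    To get a feel for what such a comparison involves, the following purely illustrative
    sketch computes the Euclidean (L2) distance between the first descriptor of each image
    by hand (smaller values indicate more similar patches). The matchers introduced below
    do this work, and more, for you:
  </para>
  <programlisting>// Illustrative only: compare the first query descriptor against the first
// target descriptor using the Euclidean (L2) distance. The ivec field holds
// the 128 descriptor elements as bytes.
byte[] d1 = queryKeypoints.get(0).ivec;
byte[] d2 = targetKeypoints.get(0).ivec;

double sumSq = 0;
for (int i = 0; i &lt; d1.length; i++) {
    int diff = d1[i] - d2[i];
    sumSq += diff * diff;
}
System.out.println("L2 distance: " + Math.sqrt(sumSq));</programlisting>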
  <para>
    The challenge in comparing <code>Keypoint</code>s is figuring out which
    <code>Keypoint</code>s from the query image match those from the target. The most
    basic approach is to take a given <code>Keypoint</code> in the query and find the
    <code>Keypoint</code> in the target that is closest to it. A minor improvement on top
    of this is to disregard points that match well with many other points in the target;
    such points are considered non-descriptive. Matching in this way can be achieved in
    OpenIMAJ using the <code>BasicMatcher</code>. Next we’ll construct and set up such a
    matcher:
  </para>
  <programlisting>LocalFeatureMatcher&lt;Keypoint&gt; matcher = new BasicMatcher&lt;Keypoint&gt;(80);
matcher.setModelFeatures(queryKeypoints);
matcher.findMatches(targetKeypoints);</programlisting>
  <para>
    We can now draw the matches between these two images found by this basic matcher using
    the <code>MatchingUtilities</code> class:
  </para>
  <programlisting>MBFImage basicMatches = MatchingUtilities.drawMatches(query, target, matcher.getMatches(), RGBColour.RED);
DisplayUtilities.display(basicMatches);</programlisting>
  <mediaobject>
    <imageobject>
      <imagedata fileref="../../figs/matches.png" format="PNG" align="center" contentwidth="5cm"/>
    </imageobject>
  </mediaobject>
  <para>
    As you can see, the basic matcher finds many matches, a large number of which are
    clearly incorrect. A more advanced approach is to filter the matches based on a given
    geometric model. One way of achieving this in OpenIMAJ is to use a
    <code>ConsistentLocalFeatureMatcher</code>, which, given an internal matcher and a
    model fitter configured to fit a geometric model, finds which of the matches produced
    by the internal matcher are consistent with respect to the model and are therefore
    likely to be correct.
  </para>
  <para>
    To demonstrate this, we’ll use an algorithm called Random Sample Consensus (RANSAC) to
    fit a geometric model called an <emphasis role="strong">Affine transform</emphasis> to
    the initial set of matches. This works by iteratively selecting a random subset of the
    matches, learning a model from that subset, and then testing the remaining matches
    against the learnt model.
  </para>
  <tip>An Affine transform models the transformation between two parallelograms.</tip>
  <para>
    We’ll now set up a RANSAC model fitter configured to find Affine transforms (using the
    <code>RobustAffineTransformEstimator</code> helper class) and our consistent matcher:
  </para>
  <programlisting>RobustAffineTransformEstimator modelFitter = new RobustAffineTransformEstimator(5.0, 1500,
    new RANSAC.PercentageInliersStoppingCondition(0.5));
matcher = new ConsistentLocalFeatureMatcher2d&lt;Keypoint&gt;(
    new FastBasicKeypointMatcher&lt;Keypoint&gt;(8), modelFitter);
matcher.setModelFeatures(queryKeypoints);
matcher.findMatches(targetKeypoints);

MBFImage consistentMatches = MatchingUtilities.drawMatches(query, target, matcher.getMatches(), RGBColour.RED);
DisplayUtilities.display(consistentMatches);</programlisting>
  <mediaobject>
    <imageobject>
      <imagedata fileref="../../figs/matches-affine.png" format="PNG" align="center" contentwidth="5cm"/>
    </imageobject>
  </mediaobject>
  <para>
    The <code>AffineTransformModel</code> class models a two-dimensional Affine transform
    in OpenIMAJ. The <code>RobustAffineTransformEstimator</code> class provides a method,
    <code>getModel()</code>, which returns the internal Affine transform model whose
    parameters are optimised during the fitting process driven by the
    <code>ConsistentLocalFeatureMatcher2d</code>. A useful byproduct of using the
    <code>ConsistentLocalFeatureMatcher2d</code> is that the
    <code>AffineTransformModel</code> returned by <code>getModel()</code> contains the
    best transform matrix to go from the query to the target.
  </para>
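  <para>
    A quick way to see what has been estimated is to print the fitted transform. The
    following minimal sketch is not part of the tutorial flow proper; it assumes the
    standard OpenIMAJ API in which <code>getTransform()</code> returns a Jama
    <code>Matrix</code>:
  </para>
  <programlisting>// A minimal sketch: print the matrix of the fitted affine transform.
// Matrix.print(w, d) is Jama's formatted output (column width w, d decimal places).
Jama.Matrix transform = modelFitter.getModel().getTransform();
transform.print(5, 2);</programlisting>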
  <para>
    We can take advantage of this by transforming the bounding box of our query with the
    transform estimated in the <code>AffineTransformModel</code>, allowing us to draw a
    polygon around the estimated location of the query within the target:
  </para>
  <programlisting>target.drawShape(
    query.getBounds().transform(modelFitter.getModel().getTransform().inverse()), 3, RGBColour.BLUE);
DisplayUtilities.display(target);</programlisting>
  <mediaobject>
    <imageobject>
      <imagedata fileref="../../figs/fittedmodel.png" format="PNG" align="center" contentwidth="5cm"/>
    </imageobject>
  </mediaobject>
  <sect1 id="exercises-6">
    <title>Exercises</title>
    <sect2 id="exercise-1-different-matchers">
      <title>Exercise 1: Different matchers</title>
      <para>
        Experiment with different matchers; try the <code>BasicTwoWayMatcher</code> for
        example.
      </para>
    </sect2>
    <sect2 id="exercise-2-different-models">
      <title>Exercise 2: Different models</title>
      <para>
        Experiment with different models (such as a <code>HomographyModel</code>) in the
        consistent matcher. The <code>RobustHomographyEstimator</code> helper class can be
        used to construct an object that fits the <code>HomographyModel</code>. You can
        also experiment with an alternative robust fitting algorithm to RANSAC, called
        Least Median of Squares (LMedS), through the
        <code>RobustHomographyEstimator</code>. A possible starting point is sketched
        after the tip below.
      </para>
      <tip>A HomographyModel models a planar Homography between two planes. Planar
      Homographies are more general than Affine transforms and map quadrilaterals to
      quadrilaterals.</tip>
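      <para>
        The following unverified sketch shows one way the estimator might be constructed.
        The arguments mirror the <code>RobustAffineTransformEstimator</code> call used
        earlier, and the trailing <code>HomographyRefinement</code> argument is an
        assumption, so check the Javadoc for the exact constructor signatures:
      </para>
      <programlisting>// Hypothetical starting point for Exercise 2: the constructor arguments are
// assumed to mirror RobustAffineTransformEstimator and should be checked
// against the OpenIMAJ Javadoc.
RobustHomographyEstimator homographyFitter = new RobustHomographyEstimator(5.0, 1500,
    new RANSAC.PercentageInliersStoppingCondition(0.5), HomographyRefinement.NONE);

matcher = new ConsistentLocalFeatureMatcher2d&lt;Keypoint&gt;(
    new FastBasicKeypointMatcher&lt;Keypoint&gt;(8), homographyFitter);
matcher.setModelFeatures(queryKeypoints);
matcher.findMatches(targetKeypoints);</programlisting>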
    </sect2>
  </sect1>
</chapter>