# Basic AR (Part 1)

### Introduction

The goal of this project is to find a plane in a video and project an image to fixed surface on that plane. Some of the techniques used in order to perform this task in real time include Hough Line detection to detect surface planes, SURF detection to detect interest points, basic KNN clustering to compare interest points between frames, RANSAC to determine the homography between frames, along with several optimization techniques in order to get the code to perform at the required 30 fps on HD video (1280×720 resolution).

The use of monocular AR combined with markerless tracking has been an actively researched topic since the mid-90s. Applications include consumer entertainment (especially on mobile and wearable devices), medical imaging, and architectural prototyping. A lot of modern AR research since the 2010s has been focused on implementing SLAM methodology.

All code and processing has been performed and tested on an Intel i5 2.3 GHz processor. All code has been written using Python 2.7 and OpenCV libraries.

### Algorithm Outline

Below is a list of the steps performed. A lot of the implementation ideas have been borrowed from the paper Markerless Tracking using Planar Structures in the Scene (by Simon, Fitzgibbon, and Zisserman).

• Initialization
1. Set image $I_1 \leftarrow frame.next$ (where $frame.next$ the first frame of a video).
2. Determine a surface plane in $I_1$.
3. Compute a homography $H_p$ that can be used to project onto the above surface plane.
4. Project object $A$ onto surface using $H_p \times A$
• Loop
1. Set $I_0\leftarrow I_1$ and $I_1 \leftarrow frame.next$.
2. Compute homography $H_1$ that projects $I_0$ to $I_1$.
3. Update the image projection homography $H_p \leftarrow H_w\times H_p$.
4. Project object $A$ onto surface using $H_p\times A$

For the remained of this report,  $H_w$ will be used to denote the homography between frames, and $H_p$ will be used to denote the homography to project the ad onto a surface.

### Plane Detection Using Hough Lines

To detect a plane upon initialization, a Hough Line detection method is used. First the edges are detected using a Canny Edge filter (i.e. a thresholded gradient of each image at each pixel). From there, pixels along each edge are converted from a point in Cartesian space to a curve in the Hough space $(r,\theta)$. Intersections (or near intersections) of multiple curves in Hough space correspond to lines in the original Cartesian space.

I assume there is a grid-like structure on the plane to which I am projecting my image. To take advantage of this assumption I do the following: After enough lines are detected they are each clustered into their own groups based on their slope. From there I assume the two largest clusters determine perpendicular lines on a plane. I use lines from the top two clusters to find 4 points onto which I will project my image. Note that this method will fail if there are no dominant lines in the image, or if the two most dominant line directions are not perpendicular.

In the above images the left image would result in good surface detection. The right image would result in bad surface detection since the dominant lines do not determine a grid on a plane.

To be continued…