Augmented Reality - A Better Vision

Augmented reality (AR) combines digital information with live video of the user's real-world environment. To enrich the real-world view, AR draws on computer-generated sensory inputs such as GPS data, graphics, sound or video. As AR technology improves, the user's surroundings become more interactive and can be manipulated digitally. Augmented reality gives a sense of illusion, a kind of virtual reality.

An AR system is said to have three properties: it blends virtual and real images within the real environment, it is interactive in real time, and it is registered in 3D. These features make AR a new type of real-time user interface that supports human interaction with digital devices and objects. An AR view is produced by performing four distinct basic tasks and combining their results into a useful output.

Basic Functions of Augmented Reality

  1. Scene capture: The view of reality to be augmented is first captured with a video capture device such as a camera or a head-mounted display.
  2. Scene identification: The captured scene is scanned to determine the exact position at which the virtual content should be embedded. This position is identified either by visual tags (markers) or by tracking technologies such as sensors, lasers or GPS.
  3. Scene processing: Once the scene has been identified and recognized, the related virtual content is requested from the Internet or from a database.
  4. Scene visualization: Finally, the AR system produces a combined image of the real and the virtual content.
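
The four tasks above can be sketched as a simple pipeline. The Python below is illustrative only: the function names, the dictionary-based frame and the in-memory `CONTENT_DB` are hypothetical stand-ins for a real camera, tracker and content service.

```python
# Hypothetical stand-in for an online content database keyed by marker id.
CONTENT_DB = {"marker-7": "3D model of a chair"}


def capture_scene():
    """Step 1: pretend to grab a frame; a real system would read a camera."""
    return {"pixels": "raw image data", "visible_markers": ["marker-7"]}


def identify_scene(frame):
    """Step 2: find where virtual content should go (here: marker ids)."""
    return list(frame["visible_markers"])


def process_scene(marker_ids):
    """Step 3: request the virtual content linked to each identified marker."""
    return {m: CONTENT_DB[m] for m in marker_ids if m in CONTENT_DB}


def visualize_scene(frame, content):
    """Step 4: combine the real frame and the virtual content."""
    return {"background": frame["pixels"], "overlays": content}


def augment_once():
    """Run the four steps once and return the combined result."""
    frame = capture_scene()
    markers = identify_scene(frame)
    content = process_scene(markers)
    return visualize_scene(frame, content)
```

In a live system `augment_once` would run once per video frame, so each step has to finish within the frame time for the result to feel interactive.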

Reality-Virtuality continuum

This section discusses the position of Augmented Reality relative to the real world and Virtual Reality (VR). The reality-virtuality continuum introduced by Milgram defines the concept of mixed reality and describes the link between the virtual and the real world. The real world sits at one extreme of the continuum and virtual reality at the other, with Augmented Reality nearer to the real-world end. The closer a system is to the virtual reality end, the fewer real elements it contains. The display used can move an AR system closer to the real or to the virtual end: systems using optical see-through displays sit nearer to the real world than those using video mixing. Augmented virtuality is the counterpart of augmented reality: a virtual world is augmented with real-world scenes, whereas in augmented reality the real world is augmented with virtual objects. The continuum places augmented virtuality nearer to the virtual reality end.

Figure: Reality-virtuality continuum

Simple Augmented Reality

The components of a simple augmented reality system are a camera, a computational unit and a display. The camera captures an image, the system overlays virtual objects on top of it, and the result is displayed. The figure below shows the flowchart for simple Augmented Reality.

Figure: Flowchart for simple Augmented Reality

The capturing module acquires the image from the camera. The tracking module calculates the correct location and orientation for the virtual overlay. The rendering module uses this pose to combine the virtual components with the original image and presents the augmented image on the display.

The ‘heart’ of the AR system is the tracking module, which calculates the relative pose of the camera in real time. Pose means the 3D location and orientation of an object, also known as its six Degrees of Freedom (DOF) position. This module is what allows the system to add virtual objects to the real world, and to move and rotate them in 3D coordinates rather than in 2D as in other image processing tools. Markers offer a simple way to calculate pose. Besides marker-based tracking, there are feature-based and hybrid tracking. Readily available video capture libraries can be used for image acquisition, and AR toolkits and libraries also provide the required support for capturing.
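
To make the idea of a six-DOF pose concrete, here is a minimal sketch in plain Python. It assumes a pose given as three translations plus three rotation angles (yaw, pitch, roll, applied in Z-Y-X order). That angle convention and the function names are assumptions chosen for illustration; toolkits differ in how they represent orientation.

```python
import math


def rotation_matrix(yaw, pitch, roll):
    """3x3 rotation R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def apply_pose(point, pose):
    """Rotate, then translate, a 3D point by a pose (x, y, z, yaw, pitch, roll)."""
    x, y, z, yaw, pitch, roll = pose
    r = rotation_matrix(yaw, pitch, roll)
    rotated = [sum(r[i][j] * point[j] for j in range(3)) for i in range(3)]
    return [rotated[0] + x, rotated[1] + y, rotated[2] + z]
```

For example, a pose of `(1, 2, 3, math.pi / 2, 0, 0)` turns the point `(1, 0, 0)` a quarter turn about the vertical axis and then shifts it, giving approximately `(1, 3, 3)`.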

The rendering module draws the virtual image on top of the real image captured by the camera. AR uses a virtual camera that is identical to the real camera used by the system, so that virtual objects are projected in the same way as real objects and the result is convincing. To replicate the real camera exactly, the system must know its optical characteristics. Camera calibration is used to identify these characteristics; it can be part of the AR system or a separate process. Many toolkits, such as ALVAR and ARToolKit, come with a calibration tool.

Depending on the application, the camera used for the AR system can be a digital camera, a FireWire camera, a USB camera and so on. The computational unit can be any sufficiently capable device such as a PC, laptop or mobile phone. The display can be a head-mounted display, the display of a device with a built-in camera, an external display, or a projection onto the real world. The augmentation setup depends on the application as well as the environment.
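
A minimal sketch of such a virtual camera is the pinhole model below. It assumes calibration has already produced the focal lengths `fx`, `fy` and the principal point `cx`, `cy` in pixels, and it ignores lens distortion. The function name is hypothetical and the numbers in the usage note are made up for illustration.

```python
def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point given in camera coordinates to pixel coordinates.

    Uses the ideal pinhole model: u = fx * x / z + cx, v = fy * y / z + cy.
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u, v)
```

With `fx = fy = 800`, `cx = 320` and `cy = 240`, the point `(0.5, -0.25, 2.0)` lands at pixel `(520.0, 140.0)`. Rendering virtual objects with the same parameters as the real camera is what keeps them aligned with the video image.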

Augmented Reality system classification

AR systems can be classified by hardware (type of tracking system), visualization approach (optical see-through, video mixing), range (outdoor, indoor) or communication (hardwired, wireless). The most common classification is by visualization approach, that is, video mixing or optical see-through. In both approaches the display on which the captured images are visualized can vary, for example a Head Mounted Display or a desktop screen.

The tracking systems used in AR hardware can be marker-based, feature-based or hybrid. The following sections briefly describe these tracking methods.

Marker-Based Tracking

Tracking calculates the relative pose of the camera in real time. Almost all AR systems include a camera, so visual tracking methods are of great importance. In visual tracking, the pose of the camera is deduced from what it sees. In an unknown environment this is challenging: acquiring the data needed to establish the pose takes a long time, the orientation of the coordinate axes is chosen arbitrarily, which is inconvenient for the user, and deducing pose from observation alone is prone to error. One solution is to place a predefined, easily detectable sign in the environment and detect it with computer vision methods. A marker is such a sign: one that a computer system can detect in a video image using image processing, pattern recognition and other computer vision techniques. Once the marker is detected, it defines both the scale and the pose of the camera. This approach, called marker-based tracking, is widely used in AR. It is popular because it is easy to implement and because well-known marker-based toolkits are available.
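
One reason a marker fixes the scale is that its physical size is known in advance. As a rough illustration, under a pinhole camera model and assuming the marker faces the camera squarely, its distance follows from similar triangles. The function name and the numbers in the usage note are hypothetical; real toolkits recover the full pose from the marker's four corners rather than from size alone.

```python
def marker_distance(focal_length_px, marker_side_m, observed_side_px):
    """Distance in metres to a square marker that faces the camera squarely.

    Similar triangles give: distance = focal_length * real_size / image_size.
    """
    if observed_side_px <= 0:
        raise ValueError("observed size must be positive")
    return focal_length_px * marker_side_m / observed_side_px
```

For instance, a 10 cm marker that appears 40 pixels wide through a camera with a focal length of 800 pixels is about 2 m away.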

Figure: Marker-based tracking

Marker-based tracking can be used to see how a product fits in its intended place before purchasing it. For example, the figure above shows how 3D models of objects fit into a desired space in a room. Furniture can likewise be rearranged virtually by placing markers at the desired locations and seeing how the result looks.

Feature-Based Tracking

Feature-based tracking is another method for tracking the relative pose of the camera. The local features it uses fall into three categories: feature points, feature descriptors and edges. A feature point is a small area of an image with a well-defined position and a clear definition. A feature descriptor characterizes an image region. Edges are the outlines or profiles of objects, and may also appear in other regions. Edges are matched on the basis of their profile and orientation. In AR applications, edge detection and matching are commonly used in model-based tracking. Edge detection is also used in occlusion handling, and the system can use straight line segments to deduce vanishing points and from them calculate camera parameters.

Figure: Feature-based tracking

A good feature has a clear mathematical definition and a well-defined position in image space, and the local image structure around it is rich in information. It should remain invariant under perspective transformation, rotation, scaling and changes in local and global illumination. For tracking purposes, a good feature is one that can be detected robustly at different times.

Two methods are used to find feature points and their correspondences:

  • Tracking only: Features are selected that can be tracked locally.
  • Detection and matching: Features are first detected and then matched on the basis of their local appearance.

A third method, detection and local matching (or detection and tracking), combines the two. It is a compromise between speed and accuracy and is widely used in AR applications.

Feature selection, detection and matching or tracking are interlinked, and tracking systems combine these steps in a single process. Such systems track features in real time and match them against the background.
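
The detection-and-matching idea can be illustrated with a brute-force nearest-neighbour matcher. This is a toy sketch: descriptors are assumed to be short lists of numbers compared by Euclidean distance, and the `max_distance` threshold is an arbitrary illustrative value. Real systems use much richer descriptors and faster search structures.

```python
import math


def match_features(descs_a, descs_b, max_distance=0.5):
    """Match each descriptor in descs_a to its nearest one in descs_b.

    Returns (index_a, index_b) pairs whose distance is within max_distance.
    """
    matches = []
    for i, da in enumerate(descs_a):
        best_j, best_d = None, float("inf")
        for j, db in enumerate(descs_b):
            d = math.dist(da, db)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None and best_d <= max_distance:
            matches.append((i, best_j))
    return matches
```

Features with no sufficiently close counterpart are simply left unmatched, which is how a tracker copes with points that have left the view or become occluded.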

Hybrid Tracking

A hybrid system combines two or more tracking methods, for example model-based tracking and sensor tracking. One early example is Intersense’s hybrid system, which combines a vision-based system with an inertial tracker. In this system the relative position and pose of the camera and the inertial tracker are fixed. Using the inertial tracker, the system predicts the position of the markers in view and limits the search window accordingly, which speeds up the image analysis.

By visualization or display approach, AR systems are classified as monitor-based, see-through and spatially augmented reality systems.
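
The search-window idea can be sketched as follows. The sketch assumes the inertial measurement has already been converted into a predicted image-plane velocity for the marker, in pixels per second. That conversion, the function name and the `margin` parameter are hypothetical simplifications, not a description of any particular product.

```python
def predict_search_window(last_xy, velocity_px_s, dt, margin, image_size):
    """Return (left, top, right, bottom) around the predicted marker position.

    The prediction is last position + velocity * dt, and the window is
    clamped to the image bounds.
    """
    width, height = image_size
    px = last_xy[0] + velocity_px_s[0] * dt
    py = last_xy[1] + velocity_px_s[1] * dt
    left = max(0, int(px - margin))
    top = max(0, int(py - margin))
    right = min(width, int(px + margin))
    bottom = min(height, int(py + margin))
    return (left, top, right, bottom)
```

Searching a window of roughly 100 by 100 pixels instead of a full 640 by 480 frame cuts the number of pixels to examine by a factor of about 30, which is where the speed-up comes from.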

Monitor-based Augmented Reality system

Figure: Monitor-based AR

A monitor-based system lets the viewer see real-world objects with virtual objects superimposed on a regular display, which eliminates the need for special glasses. This method is commonly used for testing systems in laboratories and for low-cost demos. To place a virtual object in the real-world scene, its correct position relative to the scene is required. The frames of the real-world scene and the computer-generated frames are then combined and shown on the normal display.

See-Through Augmented Reality systems

Figure: See-through AR


In see-through AR systems the user observes the surrounding environment directly, which gives the user the maximum perception of the real world. Mirrors are used to superimpose the computer-generated scene on the real-world scene to achieve display augmentation.

The computer-generated visuals are projected onto the real world, taking into account the position of the user’s head as obtained by tracking systems. The virtual image should move as the user’s position changes. Designing optical see-through displays requires resolving many issues, such as determining the user’s position accurately, calibration and matching of viewpoints, providing a sufficient field of view, and perceptual issues.


To overcome these technical difficulties, the optical see-through technique can be replaced by a video-based HMD, forming a ‘video see-through’ display. The figure below shows a video see-through AR system.

Figure: Video see-through AR


A video camera records the real-world scene, in effect projecting the 3D scene onto a 2D image plane. The image formed on the image plane is determined by the camera’s internal parameters, such as focal length and lens distortion, and by its external parameters, such as position and orientation. The computer graphics system generates the virtual image in an object reference frame. The graphics system needs the camera parameters in order to render virtual objects into the real-world scene and to set the position and orientation of the virtual camera used to generate the images of the virtual objects. The image obtained in this way is combined with the real scene image to give the augmented reality view.
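
The final combination step can be illustrated with per-pixel alpha blending. In this toy sketch an image is a nested list of pixels, the real image holds RGB tuples and the rendered virtual image holds RGBA tuples whose alpha is 0 where the real scene should show and 1 where the virtual object should show. This representation is an assumption made for illustration only.

```python
def composite(real, virtual):
    """Blend a virtual RGBA image over a real RGB image of the same size.

    Each output channel is alpha * virtual + (1 - alpha) * real.
    """
    out = []
    for real_row, virt_row in zip(real, virtual):
        row = []
        for (r, g, b), (vr, vg, vb, a) in zip(real_row, virt_row):
            row.append((
                round(a * vr + (1 - a) * r),
                round(a * vg + (1 - a) * g),
                round(a * vb + (1 - a) * b),
            ))
        out.append(row)
    return out
```

Pixels where the virtual image is fully transparent pass the camera image through unchanged, so only the rendered objects appear to be added to the scene.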


Spatially Augmented Reality systems

In this approach, the generated 2D imagery is attached to a fixed physical display surface rather than to the moving head of the user.

SAR has a different occlusion relationship from see-through AR systems: a real object can occlude virtual objects, but a virtual object cannot obstruct the real-world scene. SAR uses a fixed world coordinate system to render computer-generated virtual objects. Because the images are produced in the physical environment rather than in an individual user’s space, SAR allows only one head-tracked user at a time in any environment. More users can be supported by adding time-multiplexed shuttered glasses.

Available displays for Augmented Reality systems

Head Mounted Displays (HMDs) are goggles or helmets that place a screen in front of each eye so as to present images to the user in 3D. An HMD comes with a head tracker so that the images are adjusted as the user’s head moves. Early HMDs were heavy and significantly large, but nowadays they have shrunk to the size of sunglasses.

Optical see-through (ST) HMDs place optical combiners in front of the viewer’s eyes while still providing a direct view of the real world. The combiners are partially transmissive, so the user can see the real world directly through them, and partially reflective, so the user also sees virtual images bounced off them. Depending on the combiners used, the brightness of the real-world scene is reduced. This type of display also requires advanced calibration and tracking to ensure correct registration of virtual content.
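
The brightness trade-off can be shown with a simple idealized model. It assumes the light reaching the eye is the real-world luminance scaled by the combiner's transmittance plus the display luminance scaled by its reflectance. The function name and the sample values are hypothetical, and real combiners also depend on wavelength and viewing angle.

```python
def combiner_luminance(real, virtual, transmittance, reflectance):
    """Luminance reaching the eye through an idealized optical combiner."""
    if not (0 <= transmittance <= 1 and 0 <= reflectance <= 1):
        raise ValueError("fractions must lie in [0, 1]")
    return transmittance * real + reflectance * virtual
```

With a transmittance of 0.5, a real scene of 200 units reaches the eye at only 100 units, which is the reduction in brightness described above.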

Figure: Optical see-through HMD

Video ST HMDs combine a closed-view HMD with one or two head-mounted video cameras. The cameras capture real images, which are blended with the graphics created by the scene generator. The resulting image is sent to the monitors placed in front of the user’s eyes inside the closed-view HMD. The major disadvantage of this type of display is that the real world is sampled at the camera’s video resolution and the result depends on the quality of the image sensor. The method may also introduce system delays, which can be dangerous in certain applications. It depends heavily on the camera’s ability to capture real images, and the visualization fails if the camera suffers power loss, overexposure or similar problems.

Figure: Video see-through HMD

See-through glasses are a variant of the HMD that is also used to superimpose virtual objects on the real-world scene. Micro Optical, Personal monitor, Olympus, Hitachi and Sony are a few of the companies that produce see-through glasses.

Augmented Reality Applications

AR has a wide range of applications in fields such as medicine, entertainment, engineering design, industry, the military, robotics, manufacturing and consumer applications.


In the medical field, augmented reality has a number of applications. In surgery it can be used both as a training aid and as a visualization method. Non-invasive sensors such as Computed Tomography (CT) scans, ultrasound imaging or Magnetic Resonance Imaging (MRI) collect 3D datasets of the patient in real time. By rendering these datasets in real time and combining them with the view of the real patient, the doctor gets a kind of X-ray vision of the patient. Minimally invasive surgery reduces the trauma of an operation by using small incisions or none at all. Its drawback is that the doctor can see less of the inside of the patient, which makes the operation more difficult. AR provides a better internal view of the patient without larger incisions. In medical visualization, AR can display on the patient the position at which to operate. By combining data from non-invasive sensors with what the surgeon sees with the naked eye, AR helps surgeons work with great precision, for example when drilling a hole into the skull or performing a needle biopsy of a tiny tumor. As a training aid, AR presents a surgeon with the steps to follow during an operation. One research group has scanned a pregnant woman’s womb with an ultrasound sensor, generated a 3D depiction of the fetus and displayed it in a see-through HMD, which lets the doctor view the movements of the fetus.

Figure: Medical application

Mobile Phones

The iPhone and phones running the Android operating system support various forms of augmented reality. Many apps on the market use AR principles to give easy access to places of interest. Layar, for example, gives the user information about the surroundings. When the phone is pointed at a building, the app can retrieve related information such as vacancies in the building and photos of it available on Flickr.


Wikitude is another popular mobile AR application. Its software provides context-sensitive information: it can recognize objects or identify locations and link the real-world situation to digital information. The software is designed to run on any smartphone and displays digital information about the user’s surroundings within the camera view. This information may relate to places of interest to the user, such as nearby restaurants, shops and routes. The software combines user location identification using GPS and Wi-Fi, 3D modeling and image recognition.
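
Location-based lookup of nearby places can be sketched with the haversine great-circle distance. This is not Wikitude's API: the function names, the sample coordinates in the usage note and the mean Earth radius of 6,371 km are assumptions for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby(user, places, radius_m):
    """Return the names of places within radius_m of the user, nearest first.

    user is (lat, lon); places is a list of (name, lat, lon) tuples.
    """
    hits = []
    for name, lat, lon in places:
        d = haversine_m(user[0], user[1], lat, lon)
        if d <= radius_m:
            hits.append((d, name))
    return [name for _, name in sorted(hits)]
```

Given a user at `(48.0, 16.0)` and places at `(48.001, 16.0)`, `(48.0, 16.01)` and `(48.1, 16.0)`, a radius of 1000 m keeps the first two, roughly 111 m and 744 m away, and drops the third at about 11 km.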


Video Games

Figure: Video games

AR is used in video games to treat users by bringing digital play into the real-world environment. Over the last 10 years, improvements in technology have brought about better AR games that detect the player’s movements precisely and respond accurately to them. Many video game companies on the market use augmented reality, including Total Immersion and Merchlar.


Augmented reality has offered viewers a visual treat ever since it became part of Hollywood. Storytellers such as James Cameron have used AR to depict the vivid adventures of the movie ‘Avatar’, and other storytellers will hopefully continue to use the method. Previously, movies with complex special effects were made in the traditional way, using props. From a ball on a stick to foam, these props helped in filming the interaction between live elements and computer-generated assets. The traditional method slowed production, because scenes whose timing errors only showed up in post-production had to be re-shot, wasting both money and time. AR, by contrast, lets film-makers view the real-world scene overlaid with computer-generated assets, which helps them shoot the scene correctly and without timing errors.


In television, augmented reality found its first application in weather visualization. Weathercasts now display full-motion video from multiple cameras in real time, combined with 3D graphics and mapped onto a common virtual geo-space model. In sports telecasting, AR uses tracked camera feeds to provide see-through and overlay augmentation for enhanced viewing. In swimming telecasts, AR often adds a line across the pool to mark the position of the swimmers, along with details of the current record holders, so that viewers can compare these with the ongoing race and judge the performance. In future, AR is expected to let TV viewers take part in the programs they are watching: virtual objects can be placed in programs and viewers can interact with them.



Augmented reality has applications in the military field, where it serves as a networked communication system that delivers useful information about the battlefield to the soldier’s goggles in real time. A soldier can mark people and objects with special indicators to warn of potential dangers. Virtual maps can aid the soldier’s navigation, 360° camera imaging can provide a view of the battlefield, and this information can be transferred to remote command centers for military leaders to view. Military helicopters and aircraft use Head-Up Displays to combine vector graphics with the real-world scene and give the pilot a better view of the situation. Besides providing navigation and flight information, these graphics provide a way to aim the aircraft’s weapons at a target.



AR applications can enhance the education curriculum by bringing video, text, graphics and audio into a student’s real world. Students can interact with computer-generated graphics to improve their learning and their understanding of, for example, historical events. Students and instructors can collaborate remotely, sharing a common virtual environment for learning and interacting through virtual objects and learning materials.




Augmented reality still needs to overcome challenges such as the range of GPS and its limited performance indoors. People also find it uncomfortable to view superimposed information on smartphone screens because of their limited size, and would prefer something like Sixth Sense, or glasses and contact lenses that support AR, such as Google Glass and the smart lenses from Microsoft and Google. A drawback of AR is that people stop seeing what is right in front of them and look to their devices for information instead of talking with other people, which weakens human relationships. Privacy is a major issue when image recognition software is combined with AR, since it can provide personal information from a person’s social networking profiles just by pointing a phone at them.


Augmented Reality is one of the technologies that have gained a lot of interest. By mixing the real-world scene with virtual objects, it allows a level of immersion that no other virtual equipment can provide. Its applications include medicine, engineering, video games and mobile devices. AR has a bright future, with implementations appearing in many areas of daily life.
