
A long time ago I did a blog post on how to do AR on the iPhone. That code was written pre-iOS 4.0 and used the UIGetScreenImage function. I’ve now updated the code to use the new 4.0 features.

The code is available here

With the advent of iOS 4.0 we now have proper access to the real-time camera stream, and the use of UIGetScreenImage is no longer allowed.

To access the camera we now use the AV Foundation framework. There are a number of really good resources on how to use this. The WWDC video “Session 409 - Using the Camera with AV Foundation”, along with the sample code from that session, is a great resource (registered iPhone devs can download all the WWDC videos and sample code from iTunes by clicking here).

To access the camera we need to add the following frameworks to our application:

  • AVFoundation
  • CoreVideo
  • CoreMedia
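With those frameworks linked, the corresponding headers need importing wherever the capture code lives:

```objc
#import <AVFoundation/AVFoundation.h>
#import <CoreVideo/CoreVideo.h>
#import <CoreMedia/CoreMedia.h>
```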

The first step is to create an AVCaptureSession:

// Create the AVCapture Session
session = [[AVCaptureSession alloc] init];

If you want to show what the camera is seeing then you can create a video preview layer - this is handy if you just want to draw stuff on top of the camera preview.

// create a preview layer to show the output from the camera
AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:session];
previewLayer.frame = previewView.bounds;
[previewView.layer addSublayer:previewLayer];

In the code above I’ve assumed that you’ve created a view called previewView in your view hierarchy and positioned it where you want the camera preview to appear. The preview image will be aspect scaled to fit in the frame you specify. The next step is to get hold of the capture device - here we just ask the OS to give us the default device that supports video. This will normally be the back camera. You can use the AVCaptureDevice class to query what devices are available using the devicesWithMediaType method.

// Get the default camera device
AVCaptureDevice* camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
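If you want to pick a specific camera rather than the default, a minimal sketch of enumerating the available video devices with devicesWithMediaType: might look like this (the logging is just for illustration):

```objc
// list every device capable of capturing video
NSArray *cameras = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
for (AVCaptureDevice *device in cameras) {
    NSLog(@"Found camera: %@ (position %d)", device.localizedName, (int)device.position);
}
```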

We now need to create a capture input with our camera:

// Create a AVCaptureInput with the camera device
NSError *error=nil;
AVCaptureInput* cameraInput = [[AVCaptureDeviceInput alloc] initWithDevice:camera error:&error];
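Since initWithDevice:error: can fail (for example if the camera is unavailable), it's worth checking the result before carrying on - a minimal sketch:

```objc
if (!cameraInput) {
    // couldn't open the camera - report the error and bail out
    NSLog(@"Failed to create camera input: %@", [error localizedDescription]);
    return;
}
```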

And configure the capture output. This is a bit more complicated. We allocate an instance of AVCaptureVideoDataOutput and set ourself as the delegate to receive sample buffers - our class will need to implement the AVCaptureVideoDataOutputSampleBufferDelegate protocol. We need to provide a dispatch queue to run the output delegate on - you can't use the default queue. We also set our desired pixel format - there are a couple of recommended output formats: kCVPixelFormatType_32BGRA on all devices, kCVPixelFormatType_422YpCbCr8 on the 3G, and kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange on the 3GS and 4. In this example we'll just use kCVPixelFormatType_32BGRA.

// Set the output
AVCaptureVideoDataOutput* videoOutput = [[AVCaptureVideoDataOutput alloc] init];
// create a queue to run the capture on
dispatch_queue_t captureQueue=dispatch_queue_create("captureQueue", NULL);
// setup our delegate
[videoOutput setSampleBufferDelegate:self queue:captureQueue];

// configure the pixel format
videoOutput.videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
    [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA], (id)kCVPixelBufferPixelFormatTypeKey,
    nil];

We can also configure the resolution of the images we want to capture.

[Table: capture resolution for each session preset on the 3G, 3GS, and iPhone 4 back and front cameras]
// and the size of the frames we want
 [session setSessionPreset:AVCaptureSessionPresetMedium];
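The resolution you actually get for each preset varies between devices, so it can be worth checking that a preset is supported before setting it. A minimal sketch using canSetSessionPreset: (the fallback choice here is just illustrative):

```objc
// fall back to a lower resolution preset if medium isn't available
if ([session canSetSessionPreset:AVCaptureSessionPresetMedium]) {
    [session setSessionPreset:AVCaptureSessionPresetMedium];
} else {
    [session setSessionPreset:AVCaptureSessionPresetLow];
}
```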

Finally we need to add the input and outputs to the session and start it running:

// Add the input and output
[session addInput:cameraInput];
[session addOutput:videoOutput];
// Start the session
[session startRunning];

We’ll now start to receive images from the capture session in our implementation of captureOutput:didOutputSampleBuffer:fromConnection:.

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {

To access the video data we need to use some of the functions from CoreVideo and CoreMedia. First we get hold of the image buffer from the sample buffer. We then need to lock the image buffer before we can access its data.

// this is the image buffer
CVImageBufferRef cvimgRef = CMSampleBufferGetImageBuffer(sampleBuffer);
// Lock the image buffer before accessing its data
CVPixelBufferLockBaseAddress(cvimgRef, 0);
// access the data
size_t width=CVPixelBufferGetWidth(cvimgRef);
size_t height=CVPixelBufferGetHeight(cvimgRef);
// get the raw image bytes
uint8_t *buf=(uint8_t *) CVPixelBufferGetBaseAddress(cvimgRef);
size_t bprow=CVPixelBufferGetBytesPerRow(cvimgRef);

.... do something useful with the image here .....

// and unlock the buffer when we're done with it
CVPixelBufferUnlockBaseAddress(cvimgRef, 0);

With the image data you have a number of options, you can turn it into a UIImage:

CGColorSpaceRef colorSpace=CGColorSpaceCreateDeviceRGB();
CGContextRef context=CGBitmapContextCreate(buf, width, height, 8, bprow, colorSpace, kCGBitmapByteOrder32Little|kCGImageAlphaNoneSkipFirst);
CGImageRef image=CGBitmapContextCreateImage(context);
UIImage *resultUIImage=[UIImage imageWithCGImage:image];
// release the Quartz objects - the UIImage retains what it needs
CGImageRelease(image);
CGContextRelease(context);
CGColorSpaceRelease(colorSpace);

Or you can just process the data directly. The image will be stored in BGRA format, so each row of image data has pixels consisting of 4 bytes each - blue,green,red,alpha, blue,green,red,alpha, blue,green,red,alpha etc…

If you’re doing some image processing then you’ll probably not bother with creating a UIImage and just go straight for crunching the raw bytes.
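As a purely illustrative sketch of crunching the raw bytes directly, here's a plain C function - usable as-is from Objective-C - that computes the average brightness of a BGRA frame using the usual Rec. 601 luma weights (the function name is my own invention, not part of any framework):

```c
#include <stdint.h>
#include <stddef.h>

// Average brightness of a BGRA frame. buf points at the first pixel,
// bprow is bytes per row (it may be larger than width * 4 due to padding).
double averageBrightness(const uint8_t *buf, size_t width, size_t height, size_t bprow) {
    double total = 0.0;
    for (size_t y = 0; y < height; y++) {
        const uint8_t *row = buf + y * bprow;
        for (size_t x = 0; x < width; x++) {
            const uint8_t *px = row + x * 4; // blue, green, red, alpha
            total += 0.114 * px[0] + 0.587 * px[1] + 0.299 * px[2];
        }
    }
    return total / (double)(width * height);
}
```

You'd call this from the delegate with the buf, width, height, and bprow values obtained above. Note that it walks rows via bytes-per-row rather than assuming width * 4, since CoreVideo may pad each row.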

I’ve put together a simple demo project here. The image processing is just a demo - don’t try to use it for anything important.



Chris Greening

