Marita Cheng develops AI to help the blind experience the world

Marita Cheng

Melbourne School of Engineering alumna Marita Cheng is working in Silicon Valley on a new app that allows blind people to identify the world directly in front of them using machine vision technology.

A Mechatronics and Computer Science graduate, Ms Cheng founded the international organisation Robogals in 2008 to promote engineering and technology careers to schoolgirls. She was named Young Australian of the Year in 2012, and in 2014 was supported by the Melbourne Accelerator Program to launch her robotics startup, 2Mar Robotics.

Ms Cheng is currently working at NASA Research Park in Silicon Valley with Singularity University, which was co-founded by renowned inventor and futurist Ray Kurzweil. There she co-created her latest app, Aipoly, an intelligent assistant that empowers visually impaired people to explore and understand their surroundings through computer vision and audio feedback.

Ms Cheng says her app complements the work of Kurzweil, a pioneer in assistive technology for the blind who developed early optical character recognition (OCR) and text-to-speech reading technology.

“In every focus group, people mention a Kurzweil technology they use to get about their daily lives,” she says.

The app lets the user take a picture, which is automatically uploaded to Aipoly's servers, where it is analysed and tagged; a description is then sent back to the phone and read aloud using text-to-speech.
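The capture–analyse–describe–speak flow described above can be sketched in a few lines of Python. This is an illustrative mock-up, not the real Aipoly API: the function names, the stand-in tags, and the description format are all assumptions made for the example.

```python
# Hypothetical sketch of the photo-to-speech pipeline described in the article.
# The tags and wording below are invented for illustration; the real service
# runs a trained vision model on its servers.

def analyse_image(image_bytes):
    """Stand-in for the server-side vision model: returns tags for a photo."""
    # A real backend would run a convolutional neural network here.
    return ["red", "t-shirt", "child"]

def describe(tags):
    """Turn model tags into a sentence suitable for text-to-speech."""
    return "This looks like: " + ", ".join(tags) + "."

def handle_photo(image_bytes):
    """Client-side flow: upload photo, get tags back, build a description."""
    tags = analyse_image(image_bytes)
    description = describe(tags)
    # On the phone, this string would be handed to the text-to-speech engine.
    return description

print(handle_photo(b"\x89fake-image-bytes"))
```

In the shipped app the analysis step is a network round trip to Aipoly's servers and the final string is spoken aloud rather than printed.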

Ms Cheng says this means that blind people may be able to see what their children are wearing each day, recognise street signs, find objects that are out of reach and more.

Aipoly in action.

“The power is in helping us construct the mental picture. And not everybody has the same skill at creating mental images,” says Steve Mahan, president of the Santa Clara Blind Centre and the first user of Google’s self-driving car.

“Most of us are trying to do [that]. Knowing where we are is sometimes more than an address,” he says.

The machine vision algorithm is optimised for visually impaired users, having been trained on street signs and objects commonly used by blind people.

Machine vision, or computer vision, is advancing at an exponential pace: recognition accuracy on standard benchmarks improved dramatically between 2012 and 2013 alone. Aipoly uses convolutional neural networks to identify the elements within a picture, and neural image-caption generation to feed back a semantic description of its content.
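The convolutional networks mentioned above are built from layers that slide small filters over an image to detect local patterns such as edges. A minimal sketch of that core operation, in plain Python with an invented edge-detecting kernel (real networks learn their kernels from data and stack many such layers):

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a grayscale image
    with a small kernel, returning a feature map of filter responses."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # Sum of element-wise products over the kernel window.
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A tiny image with a vertical edge (dark on the left, bright on the right),
# and a 1x2 kernel that responds to horizontal intensity changes.
edge_kernel = [[-1, 1]]
image = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
print(conv2d(image, edge_kernel))  # -> [[0, 1, 0], [0, 1, 0]]
```

The feature map lights up exactly where the intensity changes, which is the kind of low-level signal early convolutional layers extract; later layers combine such responses into object-level features that a caption generator can describe.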

There are an estimated 285 million visually impaired people in the world, and within the next five years two thirds of them are expected to become smartphone users.

Aipoly is currently looking for beta testers from around the world. Beta testers of all visual abilities (including fully sighted and blind) can sign up to test the software at the Aipoly website.