Crashing Through and The Man Who Dared to See tells the story of Mike May, who lost his vision at age three and partially recovered it after forty years of blindness.
How can we use Mike’s story to rethink our computer vision models and give visual perception to robots and autonomous cars?
AI can’t see like humans…yet.
State-of-the-art computer vision models still struggle with unsupervised learning and limited training data.
For instance, one-shot learning, which aims to build an embedding space that represents a certain type of data (human faces, animals, etc.), has proven effective for specific tasks. However, it still lacks the intelligence of the human brain for general object detection, depth estimation and perception.
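To make the embedding-space idea concrete, here is a minimal sketch of one-shot classification: each class is represented by a single example embedding, and a new sample gets the label of the closest one. The 3-dimensional vectors and the function names (cosine_similarity, one_shot_classify) are hypothetical and hand-made for illustration. In a real system the embeddings would come from a trained network, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def one_shot_classify(query, support):
    """Return the label whose single support embedding is most similar to query."""
    return max(support, key=lambda label: cosine_similarity(query, support[label]))


# Hypothetical embeddings: exactly one example ("one shot") per class.
support = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "car": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a new, unlabeled sample.
query = [0.8, 0.3, 0.1]
predicted = one_shot_classify(query, support)
print(predicted)
```

The classifier itself is trivial. All the difficulty lives in learning an embedding where similar things end up close together, which is why this approach works for narrow tasks but does not generalize the way human perception does.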
Touch to See
Mike May lost his vision at age three, after a chemical accident in his parents’ garage. He went on to recover partial vision after more than 40 years of total blindness.
Amazingly, children are born with roughly a hundred billion neurons. These are shaped and wired into neural networks throughout your whole life.
Nonetheless, this wiring is at its most powerful and flexible when you are a kid. Take a language as an example. Why is there a difference between being fluent and being native? It’s called age.
Getting older decreases the number of neurons (you lose some whenever you bang your head, accidentally or on purpose), which makes it harder for your brain to build the strong neural networks needed to pick up a new language effectively.
This is what happened to Mike May. Even though he physically recovered one hundred percent of his vision, he couldn’t see like someone who grew up with vision. This surprised and puzzled the neuroscience and vision community at the time (early 2000s).
Ione Fine, who studied the effects of long-term visual deprivation on human visual processing, showed that it results in deficits in high-level visual processing (such as face and object recognition), partly thanks to the experiments she ran with Mike May.
The neurons we assign to face and object recognition during childhood are missing for Mike May. You may wonder where they went.
Well, if you have watched Daredevil, you may have the answer already. Even though we cannot compare Mike May to a superhero, blind people develop stronger hearing, touch, smell and taste than most people do. This is because they assigned more neurons to those senses.
This is how amazing the brain is!
Machine Vision Perception
A few months after his physical vision recovery, Mike missed a sidewalk and fell into the street. He didn’t miss it because he didn’t see it, though. He missed it because he just saw a flat picture of a road, a separating line and another flat grey zone that looked like a continuation of the road. Without depth information, it is hard to tell whether that second zone is a sidewalk or more road.
My feeling is that we are currently at the same stage Mike was in after his recovery. Yet machines are not limited by age or by an initial population of neurons. So how do we get to full machine vision?
This amazing story got me thinking about the limitations of our current models, especially in depth estimation and general object detection. Maybe our neural networks lack the same complementary information as Mike May’s brain: early-age experience.
Touch and Experience are the main drivers of our vision perception. Not only do we grow up touching everything we can, but we also memorize every object and item we encounter.
Now, we need to translate Touch and Experience into models computers can understand.
Stay tuned, as I start testing those assumptions.
I will share the experiments and results along the way.