What is computer vision?
The British Machine Vision Association defines computer vision as “the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images” and “involves the development of a theoretical and algorithmic basis to achieve automatic visual understanding.” Although this sounds (and is) incredibly complex, it’s something that we do every day. When we walk into a room full of people, for example, we immediately take in the actors, setting, and dynamic of the situation—this is the “extraction”. After observing the components of the scene, we begin to figure out how they fit together—this is the “analysis”. Finally, we reach some conclusion based on the components and how they relate to each other—this is the “understanding”. All of our prior experiences (especially ones relating to the room or to the people) can be thought of as “the development of a theoretical and algorithmic basis to achieve automatic visual understanding.” So the subject of computer vision boils down to “How do we get a computer to walk into a room; observe the actors, setting, and dynamic; and then draw a reasonable conclusion based on how it all fits together?”
Money and fame
Computer vision is generating a lot of hype these days. By hype I mean excitement, speculation, and press. People love thinking about, talking about, and worrying about what will happen if highly accurate and effective computer vision becomes a widespread technology. Part of the reason it’s garnering so much attention is that it’s easier to generate hype in this day and age. Technology has helped spread information farther and faster than ever before. But the other reason that computer vision is grabbing headlines is that people are drawn to the excitement and controversy of new technology. I think most of us knew everything there was to know about the iPhone 7 well before the September launch. Every day we see headlines about 3D printing, artificial intelligence, and driverless cars. This powerful technology news cycle has helped generate interest in computer vision.
With this new popularity has come quite a bit of money. A quick search of “computer vision” on TechCrunch yields the following headlines: “Intel buys computer vision startup Movidius as it looks to build up its RealSense platform”; “Prospera raises $7 million to put computer vision and AI to work on the farm”; “Snapchat secretly acquires Seene, a computer vision startup that lets mobile users make 3D selfies”; “Intel buys computer vision startup Itseez to improve navigation in self-driving cars”; “MedEye medicine verification tech gets $5.6M to target new markets”; “Augmented reality computer vision startup Wrnch gets $1.8M series A led by Mark Cuban”; and on, and on, and on. Ambitious startups have noticed an incredibly promising and underdeveloped market, and they’re moving quickly to plant a flag in the ground and claim it for themselves. Larger players are gobbling up new entrants in an attempt to keep pace as they figure out the implications of the new technology.
“Put some respek on my name”
Even with all of this hype, I still don’t think that we’re giving computer vision enough credit for its possible applications and impact. Computer vision can be applied to almost any industry in some form or fashion. It can also be implemented in a number of different ways in order to solve problems of various sizes and complexity: it can be bolted on to existing products to give them a new depth of functionality; it can be used to create new, more elegant versions of existing products; it will enable the creation of products and services previously thought to be impossible; and it has the potential to fundamentally change the way that we interact with the world.
Computer vision chips away at the necessary level of UI. If a computer can see and interpret its surroundings, it doesn’t require us to input as much data using buttons, taps, and clicks. This new way of collecting and interpreting data will result in a dramatic decrease in the Minimum Viable Interaction (MVI) that we have with products and services.
The importance of MVI
There are a number of hurdles between a customer and a product or service. These hurdles are things like price, functionality, brand reputation, and timeliness. Businesses have to lower these hurdles until customers’ desire for the product outweighs the difficulty they impose. Price is often assumed to be the most important of these obstacles: people won’t buy something if they think it’s too expensive. In actuality, the MVI of a product or service usually carries just as much weight in a customer’s decision. People will pay more to sign up for a service online if signing up for a competitor’s service would require them to talk to the company on the phone. Consumers will spend more money if they can pay with a click rather than having to input all of their credit card information each time.
Computer vision, as it continues to mature, will have a profound effect on the MVI. Much of our manual data input and screen-time will be replaced with conversations based on computer vision and audio recognition. Imagine if your interactions with your favorite brands were more like your interactions with your friends: familiar and intuitive, without the pressure and inconvenience that often come with transactions.
Endless application, endless opportunity
Once we recognize computer vision’s incredible potential to transform the way we interact with businesses, the next question is “What industries can it be applied to?” The British Machine Vision Association lists the following as possible applications: agriculture, augmented reality, autonomous vehicles, biometrics, character recognition, forensics, industrial quality inspection, face recognition, gesture analysis, geoscience, image restoration, medical image analysis, pollution monitoring, process control, remote sensing, robotics, security and surveillance, and transportation. In all likelihood, this list will prove to be comically short. It is difficult to think of industries that would not be able to use computer vision to their advantage in some way.
It will be fun to see how startups approach the challenge of computer vision. Do they build bolt-on solutions or new products? Do they tackle problems on an industry basis? Do any arrive with greater ambition than being bought out by a larger player? Startups that can move quickly and precisely in this new territory have the opportunity to change the way that both humans and computers see the world.