By Cade Metz

Neural networks are all the rage in Silicon Valley, infusing so many internet services with so many forms of artificial intelligence…
But as good as they may be at recognizing cats in your online photos, AI researchers know that neural networks are still quite flawed, so much so that some wonder whether these pattern recognition systems are a viable path to more advanced—and more reliable—forms of AI.
Able to learn tasks by analyzing vast amounts of data, neural networks power everything from face recognition at Facebook to translation at Microsoft to internet search at Google. They’re beginning to help chatbots learn the art of conversation. And they’re part of the movement toward driverless cars and other autonomous machines. But because they can’t make sense of the world without help from such large amounts of carefully labeled data, they aren’t suited to everything. And AI researchers have limited insight into why neural networks make particular decisions. They are, in many ways, black boxes. This opacity could cause serious problems: What if a self-driving car runs someone down and the world wants to know why?
‘What we’re interested in is automating the scientific method.’
“Deep learning has really received a lot of attention, and deservedly,” says Tuomas Sandholm, the Carnegie Mellon computer science professor who helped build Libratus, the AI that recently topped the best humans at poker—without help from neural networks. “But deep learning doesn’t give you guarantees.”
It’s true. And partly because of these weaknesses in neural networks, some of the world’s biggest tech companies are now broadening the way they think about AI, judging from recent hires, acquisitions, and research, and many startups are moving in the same direction. You can think of this as the rise of the Bayesians, researchers who approach AI through the scientific method—beginning with a hypothesis and then updating that hypothesis as new data arrives—rather than just relying on the data to drive the conclusions, as neural networks do. The Bayesians look for ways of dealing with uncertainty, of feeding new evidence into existing models, of doing the stuff that neural networks aren’t all that good at.
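That hypothesis-then-evidence loop is just Bayes’ rule applied over and over. A toy sketch of the idea in plain Python (the spam scenario and the likelihood numbers here are invented for illustration, not drawn from any company’s system):

```python
# Toy Bayes' rule update: start with a prior belief in a hypothesis,
# then revise that belief each time a new piece of evidence arrives.

def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | evidence) via Bayes' rule."""
    numerator = p_evidence_given_h * prior
    denominator = numerator + p_evidence_given_not_h * (1.0 - prior)
    return numerator / denominator

# Hypothesis H: "this email is spam", with a prior belief of 20%.
belief = 0.20

# Each observation is (P(evidence | spam), P(evidence | not spam)).
for likelihoods in [(0.8, 0.1), (0.6, 0.3), (0.9, 0.2)]:
    belief = bayes_update(belief, *likelihoods)

print(round(belief, 3))  # → 0.947
```

Three pieces of evidence, each more likely under the hypothesis than not, push the belief from 20 percent to roughly 95 percent; evidence pointing the other way would pull it back down.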
Like neural networks, Bayesian methods can learn from data, but this breed of machine learning happens in a different way. “What we’re interested in is automating the scientific method,” says Ben Vigoda, founder of an AI startup called Gamalon, which is pushing in this direction through a technique called probabilistic programming.
It’s another reminder that the rapid rise of neural networks has also pumped life into so many other techniques that can help machines become smarter, from reinforcement learning to evolutionary computation. There are so many ways that machines can learn.
The Mystery Technology
When Gary Marcus sold his 15-person startup to Uber this past December, he arrived with a new kind of artificial intelligence. Or so he said.
His company was called Geometric Intelligence, a little operation making a big promise. The 47-year-old NYU psychology professor said that he and his fellow researchers were developing systems that could learn tasks from just a little data, much like humans do—that could exceed the powers of deep neural networks.
Small data systems, Marcus believes, are essential to building machines that can carry on a conversation or cars that can drive public roads all on their own. “There are problems in the domain of language and in driverless cars where you’re never going to have enough data to use brute force the way that deep learning does,” he said when Uber acquired Geometric Intelligence this past December. After all, you can’t crash cars on busy roads to gather data on how to avoid accidents in the future. “Either you can’t buy it or it doesn’t exist.”
Marcus and his co-founder, University of Cambridge professor of information engineering Zoubin Ghahramani, still won’t discuss the particulars of the tech they’re building. As is often the case in the world of technology—particularly with AI—this kind of secrecy creates an air of magic. But Ghahramani is one of those Bayesians. He specializes in a particular kind of statistical model called a Gaussian process—GP for short—and this likely plays a role in what he and Marcus are building.
The Gaussian Process
On one level, a Gaussian process is a way of finding the optimal solution to certain problems. Underpinning another mathematical technique called Bayesian optimization—Bayes! Gauss! Get to know your mathematicians!—GPs already help sites decide which ads they should show or what their home pages should look like. Uber has been hiring academics who specialize in Gaussian processes to improve its ride-hailing services. At Google, Gaussian processes help control the company’s high-altitude internet balloons.
Fundamentally, Gaussian processes are a good way of identifying uncertainty. “Knowing that you don’t know is a very good thing,” says Chris Williams, a University of Edinburgh AI researcher who co-wrote the definitive book on Gaussian processes and machine learning. “Making a confident error is the worst thing you can do.”
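To make that “knowing that you don’t know” point concrete, here is a minimal sketch of GP regression in plain Python (the data points, kernel, and settings are invented for illustration): the model’s predictive variance stays small near its training data and grows as you move away from it.

```python
import math

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel: similarity decays with distance."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)

def solve(a_matrix, b):
    """Solve A x = b by Gaussian elimination (fine for tiny systems)."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(a_matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x

def gp_predict(xs, ys, x_star, noise=1e-6):
    """Posterior mean and variance of a GP at x_star, given data (xs, ys)."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k_star = [rbf(x, x_star) for x in xs]
    alpha = solve(K, ys)   # K^-1 y
    v = solve(K, k_star)   # K^-1 k*
    mean = sum(k_star[i] * alpha[i] for i in range(n))
    var = rbf(x_star, x_star) - sum(k_star[i] * v[i] for i in range(n))
    return mean, var

xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.5]
near_mean, near_var = gp_predict(xs, ys, 1.1)  # close to the training data
far_mean, far_var = gp_predict(xs, ys, 6.0)    # far from the training data
print(near_var < far_var)  # → True: uncertainty grows away from the data
```

Far from the data, the prediction falls back toward the prior with variance near its maximum; the model is, in effect, admitting it doesn’t know.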
At Whetlab, a startup acquired by Twitter in 2015, the technique provided a better way of—wait for it—designing neural networks. Designing a neural network is a task of extreme trial and error. You’re not so much coding a piece of software as coaxing a result from a sea of data. It’s a difficult and time-consuming undertaking, but GPs and Bayesian optimization can help automate the task. As Whetlab founder and Harvard computer scientist Ryan Adams has said, the startup used “machine learning to improve machine learning.” Neural networks can suffer from the “confident error” problem, and by identifying uncertainty, this kind of optimization can help address it. Adams has since left Twitter for Google Brain, the search giant’s central AI team.
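A heavily simplified sketch of that idea, using a Bayesian model’s uncertainty to decide which hyperparameter to try next (the candidate learning rates, the toy objective, and the update rule are all illustrative assumptions, not Whetlab’s actual method):

```python
import math
import random

random.seed(0)

def objective(lr):
    """Stand-in for a slow training run: noisy score of a learning rate.
    Peaks at lr = 1e-2 in this toy setup."""
    return -(math.log10(lr) + 2.0) ** 2 + random.gauss(0.0, 0.1)

candidates = [1e-4, 1e-3, 1e-2, 1e-1]        # hypothetical learning rates
beliefs = {lr: (0.0, 4.0) for lr in candidates}  # Gaussian (mean, variance)
noise_var = 0.01

for trial in range(20):
    # Pick the candidate with the highest upper confidence bound:
    # exploit high means, but explore where variance is still large.
    lr = max(candidates,
             key=lambda c: beliefs[c][0] + 2.0 * math.sqrt(beliefs[c][1]))
    score = objective(lr)
    mean, var = beliefs[lr]
    # Conjugate Gaussian update of that candidate's belief.
    new_var = 1.0 / (1.0 / var + 1.0 / noise_var)
    new_mean = new_var * (mean / var + score / noise_var)
    beliefs[lr] = (new_mean, new_var)

best = max(candidates, key=lambda c: beliefs[c][0])
print(best)  # → 0.01, where the toy objective peaks
```

After a few trials on each candidate, the uncertainty collapses and the loop spends its remaining budget on the most promising setting, which is the whole point: expensive evaluations go where the model’s beliefs say they will pay off.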
Some researchers also believe that the small data powers of Gaussian processes can play a vital role in the push toward autonomous AI. “To build a truly autonomous agent, it has to be able to adapt to its environment very quickly,” says Vishal Chatrath, CEO of AI startup Prowler, who has worked with Ghahramani. “That means being able to learn in a data-efficient way.” What’s more, Chatrath says, Gaussian processes are easy to interpret. Unlike neural networks, they aren’t burdened by the black-box problem. If there’s an accident, you can track down the cause.
At Prowler, Chatrath has hired three academics who specialize in the technique. Based in Cambridge—home to Ghahramani and so many other specialists in GPs and related techniques—the company is building AI systems that can learn to navigate massively multiplayer games and other digital worlds. It’s complex work, but they hope it’s a step toward systems that can learn to navigate the real world.
Meanwhile, Amazon recently hired another major AI researcher who specializes in Bayesian techniques, University of Sheffield computer scientist Neil Lawrence. “Don’t panic,” Lawrence recently wrote in a blog post, with a nod to The Hitchhiker’s Guide to the Galaxy. “By bringing our mathematical tools to bear on the new wave of deep learning methods, we can ensure that they remain mostly harmless.”