The advertising world loves big, shiny, techy things. Agency and client ears perk right up when they hear about virtual reality kiosks, gadget-filled activations and holograms of dead rock stars. But then there are the tech innovations that sound a bit, or a lot, less sexy. Things like deep learning.
Deep learning is a subset of machine learning that essentially teaches computers to find patterns in sounds, images and other data. And while that may not seem like much fun to your average social marketer or copywriter, the tech giants—the Facebooks, Apples, Googles, Netflixes, Microsofts and Baidus of the world—are investing massive sums of money in it. For instance, Google reportedly spent more than $500 million to acquire deep learning firm DeepMind in 2014. Baidu, the Chinese search giant, runs deep learning and artificial intelligence-centric R&D centers in Silicon Valley and Beijing. Apple is hiring deep learning experts at a fever pitch.
Why the frenzy? Deep learning technology lets you unlock your phone with your thumbprint. It enables Facebook and government agencies to identify your face in pictures. And it helps Siri and Alexa understand just what the hell you're saying. Advertisers are experimenting with deep learning to count how many passersby stare at billboards. The self-driving cars that we're told are just around the corner rely on deep learning to avoid hitting other cars. Or people.
But soon, deep learning tech will do even more. Futurists are already thinking about new (and sometimes dystopian) ways it can be used in marketing strategies. App-makers are increasingly taking the first steps toward supercharging image recognition with deep learning.
Deep learning has existed as an academic field of inquiry since at least the 1990s. But the massive processing power it requires kept innovation limited for years.
A long line of hardware and software breakthroughs meant that, by the early 2010s, academics and corporations could experiment more with deep learning. Then, as with all things internet, cat videos changed everything. In 2012, a Google research team led by Jeff Dean (who's still at Google) and Andrew Ng (who went on to become Baidu's chief scientist and launched a $175 million fund for artificial intelligence in early 2018) connected 16,000 computer processors into a neural network that taught itself to recognize images of cats in a massive database of still images.
"Structured data is difficult to amass and expensive to curate, but it's the cornerstone of supervised learning [the most common kind of machine learning]," says Cambron Carter, computer vision technology lead at GumGum, an artificial intelligence company based in Los Angeles. "Ng, Dean, et al. managed to train a network to recognize cats, faces and human-esque structures from raw YouTube thumbnails. They decided to feed the unsupervised, data-hungry beast what it needed: data."
Following Google's breakthrough, teams at companies like Facebook, IBM and Microsoft made further advances that rendered deep learning cheaper and easier to integrate into consumer tech products and services.
In 2016, for instance, Baidu demonstrated research that let attendees enter events using face recognition instead of tickets, and voice data from Alexa has opened a rich new data cache for Amazon. Marketing and advertising firms are already using deep learning techniques to extract data from things like Instagram images and YouTube videos.
The deep learning arms race
Attaching a dollar figure to deep learning isn't easy. Even publicly traded corporations like Google, Facebook and Microsoft are cagey about exactly how deep learning makes it into their current products and the R&D they're planning for the future.
But the bread crumbs that have reached the public—research papers, presentations at conferences and the like—give an idea of deep learning's future use. And they tend to read like some techy fantasia: Self-driving cars. Virtual reality. Apps that let the blind navigate the world and that translate English into Urdu in real time.
Nvidia, a hardware firm whose graphics processing units (GPUs) have seen booming sales as a result of the rise of deep learning, lists uses on its website that range from Adobe's DeepFont, which identifies the fonts used in an image, to the National Center for Supercomputing Applications, which uses it to detect gravitational waves in real time.
"Deep learning forms the foundation of new AI-powered user experiences like image classification, natural language processing and search recommendations," Jim McHugh, Nvidia's VP of enterprise products, tells Ad Age by email.
Deep learning, at its root, lets computers do things they already do, only better. Cleverly executed deep learning algorithms mean systems can recognize which cars show up most often in Instagram pictures or track how long a viewer watches a television ad before she gets bored.
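To picture how, consider what the building block of a car-spotting system like the one described above might boil down to. This is a hedged sketch using an off-the-shelf pretrained network from the torchvision library (an illustrative choice; no vendor's actual pipeline is public):

```python
# Illustrative sketch only: classify one photo with a pretrained ImageNet
# network, the kind of step a car-spotting system could repeat and tally
# across thousands of Instagram images. Library and model are assumptions.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(pretrained=True)  # downloads trained weights
model.eval()

# "photo.jpg" stands in for a scraped social-media image.
batch = preprocess(Image.open("photo.jpg")).unsqueeze(0)
with torch.no_grad():
    scores = model(batch)
# The five likeliest of ImageNet's 1,000 labels (several are types of car).
top5 = torch.topk(scores, 5).indices
```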
Because there's a relatively small community of machine learning experts, large tech companies have been uncharacteristically open about their deep learning research. Researchers at companies like Baidu, Facebook and Microsoft are encouraged to publish papers; even hyper-secretive Apple publishes its own Machine Learning Journal on the subject.
Apple and other vendors are trying their best to help software-makers integrate deep learning functionality into apps. For instance, Microsoft offers a free Cognitive Toolkit that, according to a Microsoft representative, is designed to help create enterprise-ready AI by letting users create, train and evaluate their own neural networks. Microsoft says the toolkit's use cases range from the Chesapeake Conservancy, which trains neural networks to speed up data analysis of wild spaces, to the Chinese firm AdDoc, whose tech rapidly detects the onset of diabetes complications.
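For a sense of what "create, train and evaluate" amounts to in practice, here's a minimal sketch in the toolkit's Python API (CNTK 2.x); the two-feature data is synthetic, invented purely for illustration:

```python
# Minimal CNTK 2.x sketch: create, train and evaluate a tiny neural network.
import cntk as C
import numpy as np

features = C.input_variable(2)
label = C.input_variable(2)

# Create: a small two-layer network.
model = C.layers.Sequential([
    C.layers.Dense(16, activation=C.relu),
    C.layers.Dense(2),
])(features)

# Train: stochastic gradient descent against a cross-entropy loss.
loss = C.cross_entropy_with_softmax(model, label)
error = C.classification_error(model, label)
learner = C.sgd(model.parameters,
                C.learning_rate_schedule(0.1, C.UnitType.minibatch))
trainer = C.Trainer(model, (loss, error), [learner])

X = np.random.randn(200, 2).astype(np.float32)
Y = np.eye(2, dtype=np.float32)[np.random.randint(0, 2, 200)]  # one-hot
for _ in range(50):
    trainer.train_minibatch({features: X, label: Y})

# Evaluate: average error rate on a batch.
print(trainer.test_minibatch({features: X, label: Y}))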
Is that AI in your pocket or...?
Right now, deep learning tech is mostly leveraged on faraway cloud servers. Apple famously tries to keep as much of Siri's back end as possible on iPhones, but Alexa's voice recognition and Facebook's facial recognition both rely on distant servers.
That all may change with the next generation of smartphones. Over the coming years, software developers will vastly improve the local image and voice analysis capabilities of smartphones. And that means everything from full-featured photo and video editing software on mobile devices to reliable disease diagnosis via phone camera to 24/7 processing (opt-in, we hope) of audio captured by phones.
App-makers and social media platforms, notes Mike Gualtieri of Forrester Research, are "going to start pushing down capabilities to the mobile phone. First, for anything to do with images and then anything to do with voice recognition."
Because deep learning is iterative and learns from mistakes (every time Amazon Alexa misunderstands your query or Google Photos mislabels someone in a picture, the app gets better), tech that leverages deep learning is going to become a lot more powerful in coming years. "It's just going to get better and better and better if you understand the nature of how deep learning works," says Gualtieri.
GumGum's Carter also emphasizes changes to mobile hardware and operating systems: "Devices will be running significantly more instances of neural networks, which are required by a number of OS features and third-party applications, moving forward. Embedded and on-the-edge processing will also continue to become increasingly more popular as smartphone hardware becomes more powerful."
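What "pushing down capabilities" can look like in code: below is a hedged sketch of one common route, converting a trained network to TensorFlow Lite, a format phones can run without a server. The tooling is an illustrative assumption, not something any company quoted here has confirmed using.

```python
# Hedged sketch: convert a trained Keras model to TensorFlow Lite (TF 2.x),
# producing a file a phone can run locally, with no cloud round trip.
import tensorflow as tf

# A stand-in model; in practice this would be a trained image or voice network.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(64,)),
    tf.keras.layers.Dense(2),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Quantization shrinks the weights so the model fits a phone's memory budget.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()

# The resulting file ships inside the app itself.
with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)
```

That shrinking step is the crux on mobile: smaller weights mean less memory and battery drain, exactly the constraint the "embedded and on-the-edge processing" Carter describes has to respect.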
Smartphones have already transformed the world in a million little ways, whether by the omnipresent cameras that are now the backbone of a narcissistic culture or by rendering long-distance charges a thing of the past or by fueling revolutions in gaming and e-commerce. Deep learning is part of that transformative process, and it's only just begun.