The Brutal Truth About 'Big Data'
Sure, "big data" -- the catchphrase and the reality of it -- is everywhere, but do you always recognize a big-data moment when you see it? Or, for that matter, when it taps you on the shoulder, spins you around, slaps you in the face and punches you in the stomach?
Last month, Britain's Guardian newspaper published on its website a leaked video from defense contractor Raytheon. In it, a Raytheon staffer casually demonstrated the company's eerily named RIOT -- Rapid Information Overlay Technology -- system. As The Guardian put it, "Using RIOT, it is possible to gain an entire snapshot of a person's life -- their friends, the places they visit charted on a map -- in little more than a few clicks of a button." RIOT does so by mining public social-media posts and even information hidden in photographs (e.g., some smartphones automatically record the exact location where a photo was taken -- latitude and longitude tucked in the "EXIF header data" -- and that embedded information can stay attached to an image indefinitely). Using a pattern-recognition algorithm, RIOT can even parse Foursquare check-ins to predict where an individual might be at any given time.
The RIOT revelation was, of course, a big-data moment -- a blatant and easily recognizable one, at that.
Coincidentally, the same day The Guardian posted that leaked video, The New York Times published a review of Tesla Motors' Model S electric car. The reporter who wrote it trashed the vehicle for having inadequate driving range in cold weather, and then Tesla's CEO went ballistic on him, citing data collected by the test car to challenge the reporter's assertions about how he actually handled the vehicle.
The media-feud aspect got all the attention, but at its core the story was about data; we wouldn't have had the fuel for the feud if it weren't for the fact that products increasingly record astonishing amounts of information. In this case, a car was all too ready to tattle to a marketer (Tesla) about how a consumer (the Timesman) was using it. Tesla's data showed when and how long he charged the car, how fast he drove it, where he was at any given moment, and even how much or how little he cranked up the heat.
The thing about such stories is that they're both vaguely scary and rather misleading. Scary because of the obvious Big Brother subtext (e.g., it's not too much of a stretch to imagine that RIOT could integrate nicely with the Obama administration's drone-assassination program; instead of a "like" button, picture a "kill" button). And misleading because big data rarely works so well. Generally, it's pretty much Too Much Data and/or Useless Data and/or Inaccessible Data and/or Nobody Knows Quite What To Do With It Data and/or ... you get the idea.
I've been thinking about that reality lately because for weeks now, as I surf the web, I keep noticing ads for the Museum of Modern Art and Pop Shop -- pop-shop.com, which sells Keith Haring merchandise (you know, the "radiant baby" artist from the '80s).
Like everyone, I tell myself that I don't really notice display ads on the web, but I always realize I'm fooling myself when I notice I'm being stalked by ads. In this case, the MoMA ads are getting served up wherever I go because I recently went to its website to check its hours (I live in New York and was trying to plan a visit). As for Pop Shop, I recently bought a cool-looking Keith Haring shower curtain as a housewarming gift for a friend who's a Haring fan; I only thought to do so because it was a featured sale item in a Fab.com email blast I'd received. Because I'm a cheap bastard, I dropped in on pop-shop.com (via a Google search) to make sure that Fab's price was, in fact, a good deal. It was, so I went back and bought it at Fab.com.
The thing is, I'm not going to be buying any more Keith Haring merchandise any time soon, and MoMA already has my money; I renew my membership every year. But my "interest" in both the Pop Shop and MoMA has been duly recorded -- and obviously my web-browser cookies are on high alert -- so both organizations keep pointlessly spending money serving up seemingly "relevant" data-enabled, hyper-targeted ads as I traverse the web.
What I might actually do with my money, what I've done with it in the past (e.g., my personal point-of-sale, credit-card and loyalty-program records), what other artists I'm interested in, what other cultural activities I engage in, what I've tweeted about, what I've liked on Facebook, how much disposable income I have at any given time, what sort of offers I'm most likely to react to -- all of that information, I know, is floating around in the big-data cloud. But it only rarely meshes together in any sort of meaningful way that improves my life (and/or improves anybody's bottom line).
Meanwhile, big data keeps getting bigger and bigger (for starters, just wait until Google Glass and related technologies start recording more and more of what we all do "IRL" -- in real life). Essentially, marketers keep constructing bigger data haystacks without necessarily getting any better at figuring out how to find the needle, or what to do with the needle when they find it.
And keep in mind that fresh privacy-breach scandals along the way will likely provoke more and more (justified) consumer paranoia and further fuel the whole do-not-track movement (which, of course, is ultimately futile; just ask Raytheon).
Obviously, one thing to do in the face of the mind-bogglingly complex, all-consuming big-data explosion is panic.
And then, when you've calmed down, I suggest you read the Data Issue of Ad Age -- which, oh, here it is right here. (How convenient!) My colleagues have spent countless hours calmly digging deep into the topic from every conceivable direction to make sense of it all. We've got reports about the new data stewards at marketers like Kraft and Jim Beam; dispatches from the war over data control (from Facebook to Twitter and beyond); reports on how universities are trying to train the next generation of data talent; stories on marketers who "hoard" data, and lots more.
Of course, you still may be inclined to panic. But my data indicate that, after reading the entire Data Issue, you'll be anywhere from 42.8% to 73.6% less likely to feel overwhelmed by the big-data deluge -- which I was able to calculate by plotting your demographic variables against the mean, and then modeling a regression analysis ... uh, I'd better stop right there. Did I mention my algorithm is proprietary?