Since the Cambridge Analytica Facebook data scandal, the number one question I've been asked is, "What Facebook data did they get?" The second most popular question is, "What did Cambridge Analytica do with it?" Let's review:
What we've been told
According to Mark Zuckerberg, "In 2013, a Cambridge University researcher named Aleksandr Kogan created a personality quiz app. It was installed by around 300,000 people who shared their data as well as some of their friends' data. Given the way our platform worked at the time this meant Kogan was able to access tens of millions of their friends' data." Later in the same post, Zuckerberg claims that the issue surfaced again in 2015 and Facebook took action (perhaps not enough action) to bring the offending company into compliance with Facebook policies.
What Facebook data did they get?
The full answer to this question will come out in the upcoming investigations and legal proceedings. But we can make a pretty good guess at what data Cambridge Analytica obtained by looking at the endpoints of Facebook's Graph API (application programming interface) v1.0, which launched on April 21, 2010, and wasn't fully shut down until April 30, 2015.
Using Facebook's Graph API v1.0, developers had unfettered access to almost all of your public-facing profile data, including:
id, name, first_name, last_name, link, gender, locale, timezone, updated_time, verified.
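To make that concrete, here is a sketch of how a v1.0 request for those public profile fields would have been constructed. The user id and access token are hypothetical placeholders, and since v1.0 was shut down on April 30, 2015, this URL no longer works against the live API; it only illustrates the shape of the call.

```python
# Illustrative sketch of a Graph API v1.0 profile request.
# The token is a hypothetical placeholder, not a real credential.
from urllib.parse import urlencode

PUBLIC_FIELDS = [
    "id", "name", "first_name", "last_name", "link", "gender",
    "locale", "timezone", "updated_time", "verified",
]

def build_profile_request(user_id, access_token):
    """Return the v1.0-style URL that fetched a user's public profile fields."""
    params = urlencode({
        "fields": ",".join(PUBLIC_FIELDS),
        "access_token": access_token,
    })
    return f"https://graph.facebook.com/v1.0/{user_id}?{params}"

url = build_profile_request("me", "HYPOTHETICAL_TOKEN")
```

A single HTTP GET to a URL like this returned the whole field list as JSON, which is why harvesting profiles at scale was so cheap for app developers.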
The heart of the scandal focuses on one of the API's permissions groups called "Extended Profile Properties." These endpoints provided access to data about your friends that your friends may not have explicitly granted permission for Facebook to expose, specifically:
about_me, actions, activities, birthday, checkins, education_history, events, games_activity, groups, hometown, interests, likes, location, notes, online_presence, photo_video_tags, photos, questions, relationship_details, relationships, religion_politics, status, subscriptions, website, work_history.
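The mechanism was the OAuth login dialog: under v1.0, an app could request each of these properties both for the user (user_*) and for the user's friends (friends_*), following the documented permission-name pattern. Below is a hedged sketch of how a quiz app's login URL could have bundled those scopes; the app id and redirect URI are hypothetical, and the property list is abbreviated.

```python
# Sketch of a v1.0 OAuth login-dialog URL requesting extended profile
# properties for the user AND the user's friends. App id and redirect
# URI are made-up placeholders.
from urllib.parse import urlencode

EXTENDED_PROPERTIES = [
    "about_me", "activities", "birthday", "checkins", "education_history",
    "events", "groups", "hometown", "interests", "likes", "location",
    "photos", "relationships", "religion_politics", "status", "work_history",
]

def build_login_dialog_url(app_id, redirect_uri):
    """OAuth URL whose scope covers the user's and their friends' data."""
    scopes = [f"user_{p}" for p in EXTENDED_PROPERTIES]
    scopes += [f"friends_{p}" for p in EXTENDED_PROPERTIES]  # friends' data too
    params = urlencode({
        "client_id": app_id,
        "redirect_uri": redirect_uri,
        "scope": ",".join(scopes),
    })
    return f"https://www.facebook.com/dialog/oauth?{params}"
```

The friends_* scopes are the crux of the story: one consenting quiz-taker could unlock these properties for hundreds of friends who never saw the dialog.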
In another permissions group, the API exposed an endpoint called read_mailbox which, just as it sounds, allowed developers access to users' private messages.
Wow, that's a lot of data. What can you do with it?
With tens of millions of rich data profiles, you can apply the tools of data science to turn the data into action. This usually follows a disciplined methodology: transform the data into usable cohorts and clusters, learn the propensities of those cohorts and clusters, and then predict how they might react to messaging by using simulations (Will a $10 discount work better than a coupon for a 10 percent discount? Is this cohort likely to vote for X?). Optimization is another common use for data (Let's optimize our media buy to drive 5 percent higher conversions by doing X. Let's get more people to vote for X). None of this is new. None of this is news. This is how internet advertising works.
As crazy as it may sound, all the sensationalism, the roughly $50 billion of Facebook market value wiped out by the scandal, and all the talk of espionage and intrigue come down to one realization: your data is used to serve you highly relevant, hyper-targeted messages. Again: Not new. Not news.
If Cambridge Analytica hadn't obtained this data set from Facebook, it could have simply purchased a bunch of data and used best-practice mar-tech tools to accomplish its marketing goals. Not surprisingly, a fair number of digital marketers I've spoken to about this believe that Cambridge Analytica would have achieved better results by leveraging the platforms directly.
Why do I care about this?
The silver lining to the Facebook–Cambridge Analytica scandal is the fact that you are becoming aware of what data you create, what is collected, and how it is used. This is a very good thing.
When Facebook urges you to configure your privacy settings, it is using the wrong words. The company should be advising you that you are actually configuring your "Public Persona Settings" or your "Online Presence Settings" or simply your "Public Settings." Whom you share your stuff with and who gets to see it are practically meaningless in the context of this issue.
What you should be configuring is your "Personal Data Sharing Permission Settings" or "API Endpoint Permissions." These settings would restrict the data that could be extracted while we are passively going about our business online.
Way too complicated!
No. It really isn't. As a society, we have to raise our level of data maturity. Clearly Facebook has grown faster than it has matured. Societally, we are all neophytes when it comes to data literacy and data governance.
If you want control of your data, it would be a great idea to understand what that means. "They know everything about me," says someone scared out of their wits by sensationalist headlines. No. They don't. They have data that can be turned into action to enrich organizations smart enough to use it. Are you hurt by this? Could you be?
Right now, you get to use Facebook, Google, Gmail, Waze, and other "free" apps for the cost of your data. If you don't want to pay with your data, you are welcome to make other choices. But you should understand how your data is being used before making a blanket statement that your data should be 100 percent private and 100 percent protected. Data is a form of currency, and it is a valid way to pay for goods and services. Data can be (and is) transformed into cash by tech companies. But that does not diminish the value of the services they provide.
Just because you did not take the time to learn how you were paying for a tech service, and thought it was "free," doesn't mean it actually is free, or, more importantly, that you would be willing to pay cash for it.
So let's use this scandal as an opportunity to learn as much as we can about data so when we start the regulatory process, we do it in a responsible way.
Author's note: This is not a sponsored post. I am the author of this article and it expresses my own opinions. I am not, nor is my company, receiving compensation for it.