
Big data privacy is a bigger issue than you think
February 21, 2017


When it comes to privacy, big data analysts have a responsibility to users to be transparent about data collection and usage. Here are ways to allay users’ concerns about privacy and big data.

If you’re in the big data business, there’s a huge privacy issue that isn’t addressed as often as it should be.

The hottest privacy topic to make the headlines is the embarrassment your company will suffer if there’s a data breach. Other privacy topics that get a lot of coverage are the risk of discrimination (i.e., your algorithms show a discriminatory and illegal bias), inaccurate analysis due to fake news, and identity reverse engineering (i.e., basically undoing anonymization). While I agree these are significant issues that are exacerbated by big data, a bigger concern is what I call oracular responsibility.
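To see why identity reverse engineering is more than a theoretical worry, consider a minimal sketch of a linkage attack: joining an "anonymized" dataset to a public one on shared quasi-identifiers. All records, names, and fields below are hypothetical, invented purely for illustration.

```python
# Toy linkage attack: re-identifying "anonymized" records by joining
# on quasi-identifiers (ZIP code, birth year, gender).
# Every record here is hypothetical.

anonymized_health = [
    {"zip": "02138", "birth_year": 1975, "gender": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1982, "gender": "M", "diagnosis": "diabetes"},
]

public_voter_roll = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1975, "gender": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1982, "gender": "M"},
]

def reidentify(anon_rows, public_rows):
    """Match rows on shared quasi-identifiers, undoing the anonymization."""
    matches = []
    for anon in anon_rows:
        for pub in public_rows:
            if all(anon[k] == pub[k] for k in ("zip", "birth_year", "gender")):
                matches.append({"name": pub["name"],
                                "diagnosis": anon["diagnosis"]})
    return matches

print(reidentify(anonymized_health, public_voter_roll))
```

The point of the sketch: stripping names doesn't anonymize a dataset if the remaining columns are unique enough to join against something public.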

Why big data is a big privacy issue

Big data analytics has the power to provide insights about people that go far beyond what they know about themselves. And, as Stan Lee put it, “with great power there must also come—great responsibility.” Such is the responsibility of the oracle—thus, oracular responsibility. In fairness, this problem existed before big data, but it wasn’t a huge risk until big data analytics gave us the tools and techniques to be highly accurate with our predictions.

Let’s consider the DIKW (Data, Information, Knowledge, Wisdom) Pyramid. When most people talk about data privacy, their biggest concerns are actually with data, as it would be defined in the DIKW Pyramid. My social security number is probably sitting in multiple databases out there and if one of those databases is breached, I’ll have a huge problem.

The next level of the pyramid is information; this is where we start making actionable inferences from the data. Once you start using data to understand users’ behaviors, people get even more uneasy. And it gets worse.

The next level up is knowledge, which is where you start connecting the dots from different areas of a user’s life—their interests, shopping habits, political views, religious views, associates, professional development.

The most sophisticated practitioners of big data analytics go all the way up the pyramid to wisdom, where this knowledge is tracked over time and curated into a very personal profile. Breach or not, most people would feel very uncomfortable knowing that someone or something knows that much about them. I consider this the biggest privacy issue faced by those practicing the dark arts of big data analytics.
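The climb up the pyramid can be made concrete with a small sketch. Everything below—the user, the events, the inferences—is hypothetical, meant only to show how raw data compounds into a profile.

```python
# Climbing the DIKW pyramid for one hypothetical user.

# Data: raw facts, nearly meaningless in isolation.
data = [("2017-02-01", "bought", "running shoes"),
        ("2017-02-08", "bought", "energy gels"),
        ("2017-02-15", "searched", "marathon training plan")]

# Information: an actionable inference drawn from the data.
information = "user is shopping for running gear"

# Knowledge: connecting the dots across areas of the user's life.
knowledge = {"interest": "distance running",
             "habit": "weekly purchases",
             "goal": "training for a race"}

# Wisdom: knowledge tracked over time, curated into a personal profile.
profile = {"user_id": "u123",
           "history": data,
           "inferences": knowledge,
           "prediction": "likely to register for a spring marathon"}

print(profile["prediction"])
```

Notice that no single purchase is sensitive on its own; it’s the accumulation and curation at the top of the pyramid that most people would find uncomfortable.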

Show your cards

It’s important to be fully transparent with the subjects that you study. They might be your actual customers, or they might not be. You might be analyzing one group of people for the benefit of another group of people. In any case, it’s important to be upfront with the people you study and analyze. Cathy O’Neil, former Wall Street quant and author of Weapons of Math Destruction, explains the high risks of a big data cocktail containing opacity, scale, and damage. This poison is neutralized with transparency, which clears up the opacity.

As uncomfortable as it may be, a prominent aspect of your responsibility is to be honest with your subjects. Let them know that you study them. Let them know what your analytic capabilities are. Let them know what you know about them (in general terms), where you get your information, and your analytic reasoning.

This means you shouldn’t use completely black-box techniques like neural networks. You might build the most accurate neural network in the world, but if you can’t offer up some sort of explanation or rationale around its conclusions, then you’re just as in the dark as your subjects, which is not good. I know this sounds like a lot of information to share—and it is—so you must be careful not to overdo it.
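What does an explainable alternative look like in practice? Here is a minimal sketch of a transparent scoring model—the feature names and weights are invented for illustration—where every prediction carries its own rationale, something a black-box network can’t easily offer.

```python
# Minimal sketch of an explainable scoring model (hypothetical features
# and weights). Unlike a black-box neural network, every prediction
# comes with the reasoning behind it.

WEIGHTS = {"visits_per_week": 0.4, "items_in_cart": 0.3, "email_opens": 0.2}

def score_with_explanation(features):
    # Each feature's contribution is a simple weight * value product.
    contributions = {name: WEIGHTS[name] * value
                     for name, value in features.items() if name in WEIGHTS}
    total = sum(contributions.values())
    # Rank the contributions so the rationale can be shared, in plain
    # terms, with the subject of the analysis.
    explanation = sorted(contributions.items(), key=lambda kv: -kv[1])
    return total, explanation

score, why = score_with_explanation(
    {"visits_per_week": 3, "items_in_cart": 2, "email_opens": 5})
print(round(score, 2))  # 2.8
print(why[0][0])        # the biggest driver of the score
```

A linear model like this will usually be less accurate than a tuned neural network, but its conclusions can be explained to the people being scored—which is exactly the trade the transparency principle asks you to consider.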

But don’t give away the farm

Don’t feel so compelled by transparency that you give away your strategic secrets. After all, you are in business—a very competitive business. If you give away too much information, your competitive value is eroded. You must find a way to be transparent, while keeping the secret sauce behind the firewall. Here’s how.

Your IT leadership team should launch proactive communication campaigns, which could include PR, speaking, social media, and outreach programs—the more the better. Explain more about what you can do than how you do it. At a minimum, it’s your responsibility to let people know what you know about them and what you’re capable of doing with your analytics. For instance, if location analytics allows you to know where they are and where they’re likely to go next, then let users know you have this technology.

You should also share your prediction accuracy; this will help with reasoning in the absence of methodology. You don’t have to disclose everything about your methods, but if parts aren’t proprietary or particularly sophisticated, let them know. For example, if you’re merging their Facebook data with their Twitter data to get a better understanding of their interests, you should share that information with them. This level of transparency won’t clear you of privacy issues, but it will go a long way to build trust with the community that needs and deserves it.
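The Facebook-plus-Twitter example can be sketched in a few lines. The profiles, fields, and disclosure wording below are hypothetical; the point is that the same function that merges the data can also produce the plain-language disclosure owed to the user.

```python
# Hypothetical sketch: merging two social profiles for one user, and
# generating the plain-language disclosure that transparency calls for.

facebook = {"user": "u123", "likes": ["cycling", "coffee"]}
twitter = {"user": "u123", "follows_topics": ["cycling", "urban planning"]}

def merge_interests(fb, tw):
    # Union the interest signals from both sources.
    interests = sorted(set(fb["likes"]) | set(tw["follows_topics"]))
    # Pair the merged data with the disclosure shown to the user.
    disclosure = ("We combine your Facebook likes with the topics you "
                  "follow on Twitter to estimate your interests.")
    return {"user": fb["user"], "interests": interests,
            "disclosure": disclosure}

profile = merge_interests(facebook, twitter)
print(profile["interests"])  # ['coffee', 'cycling', 'urban planning']
```

Keeping the disclosure next to the merge logic makes it harder for the two to drift apart—what you tell users stays coupled to what you actually do.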

This article was originally published on and can be viewed in full