How to Overcome the Harms of Excessive Data Sharing

Research exposes data markets’ inefficiencies and proposes solutions to protect users’ privacy

August 2, 2022

As the use of online platforms grows, companies are collecting an increasing amount of data from individuals, a process that is set to expand exponentially with advances in artificial intelligence and machine learning.

While it is argued that data-driven products and services ultimately benefit users because of better customization, many are not convinced that sharing their information is a good deal. A 2019 survey by Pew Research Center, for example, showed that 81 percent of U.S. adults believe that the potential risks they face because of data collection by companies outweigh the benefits. Sixty-two percent believe it is not possible to go through daily life without companies collecting data about them.

“If you talk with machine learning and artificial intelligence experts, they will say that better services and recommendations will always compensate for the privacy loss of users,” says Ali Makhdoumi, an associate professor of decision sciences at Duke University’s Fuqua School of Business. His research, however, suggests otherwise.

Makhdoumi describes his area of expertise, market design, as an interdisciplinary field at the intersection of operations research, computer science, economics, and applied mathematics. “There's a lot of modeling involved, where we try to define the mechanics of the real world with math. And, once we have modeled the system, we try to design it optimally.”

Most of his recent research focuses on the data market. “Within data market, I'm mostly interested in the issues of privacy, especially the potential harms of big data and artificial intelligence, and how we can redesign the systems to get around them,” Makhdoumi says. That was the motivation behind the work “Too Much Data: Prices and Inefficiencies in Data Markets,” which was accepted for publication by the American Economic Journal: Microeconomics.

What your data reveals about others

In this paper, Makhdoumi and his colleagues—Daron Acemoglu, a professor of economics at the Massachusetts Institute of Technology (MIT), Azarakhsh Malekian, a professor of operations management and statistics at the University of Toronto, and Asu Ozdaglar, a professor of electrical engineering and computer science at MIT—argue that data markets are fundamentally different from other markets because of what researchers call the externality of data.

When you share information on social media, you may believe you are sharing your information only. But that is rarely the case. “Once you share your data, you are also sharing something about your friends, your family, and your colleagues as well. And they may not be willing to share that information,” says Makhdoumi.

The Cambridge Analytica scandal is an example of how the externality of data can affect people’s privacy. The firm acquired private information from more than 270,000 Facebook users who voluntarily shared their data by downloading a third-party app for mapping personality traits. Because of what their data revealed about the people in their network, the firm was ultimately able to infer relevant information about more than 50 million Facebook users.

Makhdoumi and his co-authors developed a data market model that considers the impact of these externalities. Their model revealed how data markets are bound to be inefficient. To understand why this is the case, imagine a platform willing to purchase your data. If the price is good enough, you may choose to share it. But, because your data reveals a lot of information about the people in your network, the platform may approach these people with a much lower offer. Knowing that part of their information is already out there, they may still accept the low price.

“Once we create this market for data, what happens is that, because of the data externalities, we're going to end up competing with each other as users and ultimately both of us are going to be paid a very marginal price for our data. The market is going to be inefficient,” says Makhdoumi. The researchers also analyzed this model in a scenario where multiple platforms compete to attract users and acquire their data. Even in that situation, data prices are destined to plummet, resulting in an inefficient market.
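The price-depressing effect of data externalities described above can be illustrated with a toy calculation. The sketch below is an assumption made for illustration and is not taken from the article: two users hold unit-variance Gaussian data with correlation `rho`, each sharer reveals a noisy signal, and "leakage" is measured as the drop in the platform's mean-squared error when estimating a user's data. The function names and the `privacy_value` parameter are hypothetical.

```python
# Toy model (assumed for illustration; the article does not specify it).
# Users 1 and 2 hold data X1, X2: unit variance, correlation rho.
# A user who shares reveals S_i = X_i + noise, with variance noise_var.
# Leakage about user 2 = reduction in the platform's mean-squared error
# when estimating X2 from the signals it holds.
# Requires noise_var > 0 or |rho| < 1 so the signal covariance is invertible.


def leakage_from_friend_only(rho: float, noise_var: float) -> float:
    """Leakage about user 2 when only user 1 has shared."""
    return rho ** 2 / (1.0 + noise_var)


def leakage_when_both_share(rho: float, noise_var: float) -> float:
    """Leakage about user 2 when both users have shared."""
    a = 1.0 + noise_var              # variance of each noisy signal
    det = a * a - rho * rho          # determinant of the signal covariance
    return (a * (1.0 + rho ** 2) - 2.0 * rho ** 2) / det


def marginal_leakage(rho: float, noise_var: float) -> float:
    """Extra leakage about user 2 caused by user 2's own decision to share."""
    return leakage_when_both_share(rho, noise_var) - leakage_from_friend_only(
        rho, noise_var
    )


def compensating_price(privacy_value: float, rho: float, noise_var: float) -> float:
    """Lowest price user 2 would accept once user 1 has already shared.

    privacy_value is a hypothetical per-unit cost the user attaches to leakage.
    """
    return privacy_value * marginal_leakage(rho, noise_var)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        price = compensating_price(privacy_value=1.0, rho=rho, noise_var=1.0)
        print(f"rho={rho:.1f}  price user 2 accepts: {price:.3f}")
```

In this sketch, the more strongly the two users' data are correlated, the less user 2 has left to protect once user 1 has shared, so the price user 2 is willing to accept falls.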

How to regulate data markets

The model made it clear that data markets require a different design. So, the researchers started to explore regulatory strategies. One alternative would be to tax data transactions. “But the only way this would work is if you had full information about the value of each user's data in this huge data market. Only then you could figure out the exact tax for each individual,” says Makhdoumi. Given the amount of information required, this strategy would be impractical. Uniform taxation would also be inefficient, the study suggested.

The authors then proposed a regulatory system in which data transactions would be mediated by a trusted third party. This agent would be responsible for reducing the correlation between a user's data and the data of other users, minimizing the information leaked about them. “The idea is to somehow decorrelate the data of users, and make sure that, when you say you're going to share your data, you're sharing your data only,” says Makhdoumi. “It's not quite clear how we could achieve that. But, conceptually, if we had the technology to do that, it could potentially resolve the issue.”
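As a purely conceptual illustration of what "decorrelating" could mean, the sketch below simulates two users with correlated data and a hypothetical mediator that sees both. Before passing user 1's data to the platform, the mediator subtracts the part that is linearly predictable from user 2's data. This is an assumption for illustration, not the scheme proposed in the paper: it removes only linear dependence, and the function names are invented here.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def decorrelate(x1, x2):
    """Hypothetical mediator step: remove from x1 its least-squares fit on x2."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    var2 = sum((b - m2) ** 2 for b in x2)
    slope = cov / var2
    return [a - slope * (b - m2) for a, b in zip(x1, x2)]


def simulate(rho=0.8, n=5000, seed=0):
    """Return (correlation before, correlation after) the mediator step."""
    rng = random.Random(seed)
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Build x1 so that its correlation with x2 is approximately rho.
    x1 = [rho * b + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for b in x2]
    shared = decorrelate(x1, x2)
    return pearson(x1, x2), pearson(shared, x2)


if __name__ == "__main__":
    before, after = simulate()
    print(f"correlation before: {before:.3f}, after: {after:.3f}")
```

In this toy setting, the transformed data no longer carries linear information about the second user, which is the property the quote describes; whether something comparable is achievable for real-world data is, as Makhdoumi notes, an open question.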

In the real world, Makhdoumi notes, users are typically compensated for their data through free services, rather than receiving monetary payment. But, conceptually, things are the same. “If I use a free service, that creates the same incentive as getting paid to share my data.”

An overlooked problem

Makhdoumi believes that data privacy issues are not getting the attention they deserve. In his research, he has been looking at some of the problems that may arise when companies have too much data from individual users and the ability to offer personalized recommendations based on that information. “There is a potential for what we call behavioral manipulation. They can, over time, perhaps change your tastes and your behavior without you even noticing.”

While personalized recommendations may be useful for some people, there is a fine line between improving the customer’s experience and manipulating their choices. Companies sometimes use questionable strategies, such as tracking people with financial problems and targeting them with offers of high-interest loans.

“We argue that there is a dark side to this as well and it is important to know these things exist and think about ways to alleviate these issues,” Makhdoumi says.

Contact Info

For more information, contact our media relations team at media-relations@fuqua.duke.edu.