People want to exploit your data. You want to exploit your data. Some people think it is bad if data (yours and everyone else's) is stuck in silos, unexploited. Other people think you should have the right to keep your data private, and have made laws to that effect (GDPR being the latest).
So some other people now write that there is a trade-off between privacy and utility - i.e. in some sense you can quantify the utility of the data, and you can quantify the level of privacy to which the data is subjected.
Various privacy technologies enforce privacy. Some are specifically about protecting individual data from being re-linked to a person (that person being re-identified in the data), by anonymising or pseudonymising the data. Others go further, subjecting collections of data to processes like fuzzing or adding noise, to provide some level of differential privacy - so that the presence or absence of an individual's data record in the aggregate makes no difference to queries on the data (for some given query count, at least).
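The noise-adding idea can be sketched concretely. This is a minimal, hypothetical example (the data and the `dp_count` helper are invented for illustration) of the classic Laplace mechanism: a counting query has sensitivity 1, so adding Laplace noise with scale 1/ε gives ε-differential privacy.

```python
import math
import random

def laplace_noise(scale):
    # Draw a Laplace(0, scale) variate by inverse-CDF sampling.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_count(records, predicate, epsilon):
    # A counting query has sensitivity 1: adding or removing one
    # individual's record changes the true count by at most 1, so
    # Laplace noise with scale 1/epsilon yields epsilon-DP.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical data: ages of individuals in a small dataset.
ages = [23, 45, 67, 34, 51, 29, 72, 40]
noisy = dp_count(ages, lambda a: a > 40, epsilon=0.5)
```

Smaller ε means more noise and stronger privacy; repeated queries spend the privacy budget, which is why the guarantee only holds "for some given query count".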
What's wrong with these pictures?
Let's unpick the "utility" piece. First of all, as a network person, I think of utility in terms of provider and customer. So in the internet, congestion management is a mechanism for joint optimisation of provider utility and customer utility - the customers get the maximum fair share of capacity, and the provider gets maximum revenue out of the customers for the resource they've committed. This formulation is a harmonious serendipity.
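The fair-share idea can be sketched in a few lines. This is a toy illustration (the demand figures are invented) of max-min fairness via progressive filling, one of the allocation disciplines congestion management aims at:

```python
def max_min_fair(capacity, demands):
    # Progressive filling: satisfy the smallest demands fully, then
    # split whatever capacity remains equally among the rest.
    alloc = {}
    remaining = capacity
    pending = sorted(demands.items(), key=lambda kv: kv[1])
    while pending:
        share = remaining / len(pending)
        name, demand = pending[0]
        if demand <= share:
            # This flow's demand fits within an equal share: grant it fully.
            alloc[name] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Everyone left wants at least an equal share: split evenly.
            for name, _ in pending:
                alloc[name] = share
            break
    return alloc

# Hypothetical link of capacity 10 shared by three customers.
print(max_min_fair(10.0, {"a": 2.0, "b": 4.0, "c": 10.0}))
```

No customer's allocation can be raised without lowering someone else's smaller allocation - that is the sense in which both sides' utilities are jointly served.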
How might utility for individual data exploitation be harmonious with utility for aggregators of data?
An example might help. Healthcare records can be used to compare and discover the effectiveness of existing treatments, and to discover relationships between different characteristics of individuals and well-being, or the onset of different medical conditions (i.e. inference!). Specifically, we might train a machine learning system on the data, resulting in a classifier which, given new input about a patient, offers a diagnosis. Or we might build a model that exposes latent (hidden) variables, and even, potentially, allows causal inference. So in the healthcare arena, there's alignment between what might be done with collections of patient data and the benefit of all the patients.

But such systems might then be turned into commercial products and run on subjects who were not part of the training data set. So what is the utility of that to the original subjects? Is their data not a form of contribution, for which they should have a share in the ownership of any tech derived from it? To be honest, most of the hard work in generating the software was in gathering and curating (cleaning/wrangling) the data. The software itself is typically open source, and requires little or no work. In many cases, supervised learning involved expert labelling of the data (e.g. surgeons or other experts looking at records, images etc., and tagging them as having evidence of some condition or not). Again, that contribution is highly valuable.

However, in this area, the presence or absence of a single individual's data makes very little difference (especially in a very large system such as the NHS, with upwards of 70,000,000 patient records). The value of the data, in this case, grows super-linearly with the number of records, so one record here or there makes no difference; a thousand or a million is where the action is. So if we posit shared ownership of systems built on this data, then the utility to the individual, and to the public at large, is aligned.
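The claim that one record makes no difference at scale can be checked directly. A minimal sketch, with invented per-patient measurements: drop a single record from a large aggregate statistic and see how far it moves.

```python
import random

random.seed(1)

# Hypothetical aggregate statistic: the mean of some per-patient
# measurement over a large population (values here are invented).
n = 100_000
measurements = [random.uniform(0.0, 10.0) for _ in range(n)]

mean_all = sum(measurements) / n
# Drop one individual's record and recompute the aggregate.
mean_without_one = sum(measurements[1:]) / (n - 1)

# One bounded record can shift the mean by at most (max value) / n.
influence = abs(mean_all - mean_without_one)
print(influence)
```

At NHS scale (70,000,000 records rather than 100,000) the influence of any single record on such aggregates is smaller still, which is why the leverage lies with thousands or millions of records, not one.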
If we just give up the data to for-profit exploitation, then the individual may end up paying for access to some machine learned tool, ironically trained on their own data. That's an obvious conflict.
Other data sets have diminishing returns as the amount of data gathered increases. A classic example is smart metering (water, electricity, gas etc.). The original UK deployment of smart meters reported the usage of millions of households every few tens of seconds. This is pointless, and consumes a lot of network bandwidth. The primary goal was to remove the need for human visits to read a meter. A secondary (misguided) goal was to offer potentially smart pricing, so consumers (or smart devices - e.g. washing machines etc.) could make dynamic decisions to reduce cost and reduce peak demand - a joint optimisation. However, the metering only needs to roughly band kinds of demand - maybe a few tens or hundreds of types of consumer, and their demand profile types over the day/week/year. The pricing can just be broadcast, and is unlikely to change much - indeed, off-peak pricing of utilities was developed decades ago to do exactly this. The actual individual usage is irrelevant, except for the aggregate bill. The model can in fact be derived from that, or from a small random sample (small compared to 35 million households).
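The small-random-sample point can be sketched numerically. Assuming a hypothetical population whose households fall into a few broad demand bands (all figures invented), a sample of a few thousand estimates the aggregate about as well as reading every meter:

```python
import random

random.seed(42)

# Hypothetical daily consumption (kWh) for a few broad consumer
# types, rather than 35 million individual high-frequency traces.
profiles = {"low": 5.0, "medium": 10.0, "high": 20.0}
population = [random.gauss(profiles[random.choice(list(profiles))], 1.0)
              for _ in range(100_000)]

def mean(xs):
    return sum(xs) / len(xs)

# A sample of a couple of thousand households recovers the
# aggregate demand model; the other ~98% of readings add nothing.
sample = random.sample(population, 2_000)
print(round(mean(population), 2), round(mean(sample), 2))
```

The sampling error shrinks like 1/sqrt(sample size), independent of the population size - which is the statistical content of "diminishing returns" here.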
So what price privacy? I don't see any trade-off at all - you either keep your data yourself, or you share it (for a share in ...) with people who can make good use of it, but no-one else.
footnote:
a separate problem with asking people to think about a trade-off in this space is that there's a tremendous imbalance in information about what can possibly go right and what can go wrong (with privacy, or with the price of your data). Let's just not go there at all.