A while back, we proposed a Sourceless Network Architecture. The notion was this: the end-to-end argument suggests only putting things in a layer if everyone above that layer wants them, and there are such things as "send and forget", where we don't expect an answer. So why does a recipient need to know where the packet came from? And if it does, the source can be put in the packet, perhaps as a name, so that if the source moves, the recipient has a better chance to still reply.

Now why do we need a destination address? This recent CACM article on metadata suggests using Tor-type systems - but these use crypto and onion-layered re-encryption to obfuscate the source and destination from third-party observers. Why put the destination address in at all? Why not just put the packet in a bottle, and throw it in the sea, to wash up on some beach where someone can take it out of the bottle, decrypt, and maybe answer the same way?
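A toy sketch of the bottle idea, under loud assumptions: every beach gets every bottle, the sender and the intended recipient already share a secret key (how they came by it is not addressed here), and a beach knows a bottle is for it only because the integrity tag checks out. The names `seal_bottle` and `open_bottle` are made up for illustration, and the hash-based keystream is a stand-in for a real authenticated cipher, not something to deploy.

```python
import hashlib
import hmac
import os

def _keystream(key, nonce, length):
    # Toy keystream from SHA-256 in counter mode (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def seal_bottle(key, message):
    # The bottle carries no source and no destination: nonce + tag + ciphertext.
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(message))
    ciphertext = bytes(m ^ s for m, s in zip(message, stream))
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + tag + ciphertext

def open_bottle(key, bottle):
    # Returns the message if the tag verifies under this key, else None.
    nonce, tag, ciphertext = bottle[:16], bottle[16:48], bottle[48:]
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

# Hypothetical beaches, each holding its own key.
beaches = {"alice": os.urandom(32), "bob": os.urandom(32)}
bottle = seal_bottle(beaches["bob"], b"hello from nowhere")

# The sea washes the bottle up everywhere; only one beach can read it.
found = {name: open_bottle(key, bottle) for name, key in beaches.items()}
```

The cost is plain to see: every beach does work on every bottle, which is the price of leaving the address off.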

All we need is an Internet Sea with lots of Internet Beaches. That cannot be too hard.

Wednesday, September 18, 2019

taxing the cloud

most ai runs in the cloud.

people are proposing a tax on the cloud.

so ai should have representation (no T without R, right?). votes for ai, now.

and more, can an ai commit a sin? if so, can we sell it an indulgence?
ai, go to hell now.

Monday, September 16, 2019

cryptocurrency and the singularity

humanity uploads itself to cyber-physical systems (aka robots) so it can swarm across the stars ahead of the heat death of the Universe. Neo-liberals, being the first to have the resources to do this, decide to implement an economic incentive system based on cryptocurrencies to make sure that the robots will spend some of their time working on mining spare parts (especially selenium for their stellar cells).

sadly, the proof-of-work in mining the currency consumes more energy than they can harvest in time, and crypto-humanity dies out without even leaving the asteroid belt.

Tuesday, September 10, 2019

the myth of the privacy/utility trade-off

People want to exploit your data. You want to exploit your data. Some people think it is bad if (your and other) data is stuck in silos, and not exploited. Some people think you should have the right to keep your data private, and have made laws (GDPR being the latest).

So some other people now write that there is a trade off between privacy and utility - i.e. in some sense you can quantify the utility of the data, and you can quantify the level of privacy that the data is subjected to.

Various privacy techs enforce privacy, but some are more specifically about protecting individual data from being relinked to a person (that person being re-identified in the data) by making the data anonymised or pseudonymised - or further, by subjecting collections of data to processes like fuzzing or adding noise, to provide some level of differential privacy (so the presence or absence of an individual's data record in the aggregate makes no difference to queries on the data, for some given query count at least).
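For the noise-adding case, here is a minimal sketch of the standard Laplace mechanism applied to a counting query. The function names, the toy `ages` list and the choice of `epsilon` are all assumptions for illustration; nothing here is tied to any particular privacy library.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    # A count changes by at most 1 when one record is added or removed
    # (sensitivity 1), so the noise scale is 1 / epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

def over_60(age):
    return age > 60

rng = random.Random(0)
ages = [23, 35, 41, 58, 62, 67, 71, 80]  # made-up records

# The same query with and without the last individual's record.
with_record = private_count(ages, over_60, epsilon=0.5, rng=rng)
without_record = private_count(ages[:-1], over_60, epsilon=0.5, rng=rng)
```

Each answered query spends some of the privacy budget, which is where the "for some given query count" caveat comes from: ask often enough and the noise averages away.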

What's wrong with these pictures?

Let's unpick the "utility" piece - first of all, as a network person, I think of utility in terms of provider and customer. so in the internet, congestion management is a mechanism to do joint optimisation of provider utility and customer utility - the customers get the maximum fair share of capacity, the provider gets maximum revenue out of the customers for the resource they've committed. this formulation is a harmonious serendipity.
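To make "fair share" concrete, here is a small sketch of max-min fair allocation of one link's capacity. Max-min fairness is my assumption about which notion of fairness to illustrate (others exist, such as proportional fairness), and the function name and demand figures are invented.

```python
def max_min_fair_share(capacity, demands):
    """Split `capacity` among users: nobody gets more than they asked for,
    and unused share is redistributed to those who still want more."""
    allocation = {}
    remaining = float(capacity)
    # Serve the smallest demands first.
    pending = sorted(demands.items(), key=lambda kv: kv[1])
    while pending:
        share = remaining / len(pending)
        user, demand = pending[0]
        if demand <= share:
            # This user is fully satisfied; hand back what they don't need.
            allocation[user] = float(demand)
            remaining -= demand
            pending.pop(0)
        else:
            # Everyone left wants more than an equal split: split equally.
            for name, _ in pending:
                allocation[name] = share
            break
    return allocation

# Hypothetical demands on a link of capacity 10.
shares = max_min_fair_share(10, {"a": 2, "b": 4, "c": 8})
# a and b get what they asked for; c gets the remaining 4.
```

The link is fully used and no customer can gain without taking from someone who has less, which is the sense in which provider and customer interests line up.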

How might utility for individual data exploitation be harmonious with utility for aggregators of data?
An example might help - healthcare records can be used to compare/discover the effectiveness of existing treatments, and to discover relationships between different characteristics of individuals and well-being or onset of different medical conditions (i.e. inference!). Specifically, we might train a machine learning system on the data, and that would result in a classifier that, given new input about a patient, offers a diagnosis. Or we might build a model that exposes latent (hidden) variables, and even, potentially, allows causal inference. So in the healthcare arena, there's alignment between what might be done with collections of patient data and the benefit of all the patients. But such systems might then be turned into commercial products and run on subjects who were not part of the training data set. So what is the utility of that, to the original subjects? Is their data not a form of contribution for which they should have a share in the ownership of any tech derived from it? To be honest, most of the hard work in generating the software was in gathering/curating (cleaning/wrangling) the data. The software itself is often open source, and requires little or no work. In many cases, supervised learning involved expert labelling of the data (e.g. surgeons/experts looking at records/images etc, and tagging them as having evidence of some condition or other, or not). Again that contribution is highly valuable. However, in this area, the presence or absence of any one individual's data matters little (especially in a very large system such as the NHS with upwards of 70,000,000 patient records). The value of the data, in this case, grows super-linearly with the number of records, so 1 record here or there makes no difference; a thousand or a million is where the action is. So if we posit shared ownership of systems built on this data, then the utility to the individual, and to the public at large, is aligned.
If we just give up the data to for-profit exploitation, then the individual may end up paying for access to some machine learned tool, ironically trained on their own data. That's an obvious conflict.

Other data sets have diminishing returns as the amount of data gathered increases. A classic example is smart metering (water, electricity, gas etc). The original UK deployment of smart meters reported, every few tens of seconds, the usage in millions of households. This is pointless. It consumes a lot of network bandwidth. The primary goal was to remove the need to have human visits to read a meter. A secondary (misguided) goal was to offer potentially smart pricing, so consumers (or smart devices - e.g. washing machines etc) could make dynamic decisions to reduce cost and reduce peak demand - this is a joint optimisation. However, the metering only needs to roughly band kinds of demand - maybe a few tens or hundreds of types of consumer and their demand profile types over the day/week/year. The pricing can just be broadcast, and is unlikely to change much - indeed, off-peak pricing of utilities was developed decades ago to do this. The actual individual usage is irrelevant, except for the aggregate bill. The model can be derived from that, in fact, or from a small random sample (small compared to 35 million households).
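A rough simulation of the sampling point, with everything invented: three made-up household profile types, a synthetic population far smaller than the real one, and a sample of 500. It compares the average daily demand profile from the whole population with the one estimated from the sample.

```python
import random

HOURS = 24

def make_household(rng):
    # Hypothetical profile types; real ones would come from measurement.
    kind = rng.choice(["evening", "daytime", "flat"])
    profile = []
    for hour in range(HOURS):
        if kind == "evening":
            base = 1.5 if 17 <= hour <= 21 else 0.4
        elif kind == "daytime":
            base = 1.0 if 9 <= hour <= 16 else 0.3
        else:
            base = 0.6
        # Per-household variation around the type's typical usage.
        profile.append(base * rng.uniform(0.8, 1.2))
    return profile

def mean_profile(households):
    n = len(households)
    return [sum(h[hour] for h in households) / n for hour in range(HOURS)]

rng = random.Random(42)
population = [make_household(rng) for _ in range(20000)]
sample = rng.sample(population, 500)  # a small random sample

full = mean_profile(population)
estimate = mean_profile(sample)

# Largest relative gap between the sampled and full hourly averages.
worst_error = max(abs(f - e) / f for f, e in zip(full, estimate))
```

In this toy setup the sampled profile tracks the full one closely, and the sampling error depends on the size of the sample rather than the size of the population, which is why metering every household every few seconds buys so little.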

So what price privacy? I don't see any trade off at all - you either keep your data yourself, or you share it (for a share in ...) with people who can make good use of it, but no-one else.

a separate problem with asking people to think about a tradeoff in this space is that there's tremendous imbalance in information about what can possibly go right and what can go wrong (with privacy or with the price of your data). Let's just not go there at all.

Tuesday, September 03, 2019

The Myth of the Reliable Narrator

Post-modern literature (and film) abounds with wacky framing devices including the authorial voice and the unreliable narrator. This is all messing with the suspension of disbelief and dramatic irony that is the very life of fiction. But fiction is all a lie. There never was a reliable narrator. The book has a cover (a beginning, an end, a narrative arc etc). A film has those weirdest of things, music; pov; zoom/pan etc etc. The author/director/actors may all be dead by the time you read/view this.

So take every story with a grain of salt. or a large G&T. or a deep breath.

Even this brief note is utterly untrustworthy.

Friday, June 14, 2019

redecentralization 2.0

It has been a core theme of a lot of my work & interests. decentralized systems are just more interesting than centralized ones. they may be inherently more resilient (but not always), and they may be more complex (but not always).

the internet is largely decentralized in its lower layers (the tubes - the routers and links, and routing algorithms). that was always intended, from baran's report for rand onwards.
society and eco-systems are often decentralized: sure, there are governments (but more than one) and bee hives (but more than one), but coordination happens peer-to-peer (a term which first arose in magna carta, but an idea which predates that by a billion years).

decentralized, infrastructureless networks are an interesting point in the design space - hence community mesh wireless networks, and opportunistic, delay and disruption tolerant networks work merely using users' devices and construct communication out of thin air. in this extreme environment, we are challenged to think of how we provide information about identity or trustworthiness, but in fact, on close examination, a central provision of those properties has many problems too - DNS certificates can be bogus or expired, source IP addresses do not have to refer to where the packet came from, an application layer user identifier (email address, facebook identity etc) is no more a true name than the Prince of Serendip.

so really, everything should be decentralized, as it forces us to confront the true problems and come up with decent solutions, instead of using the prop of undeserved respectability of a centre.

That's why we founded the centre for redecentralization. :-)

Thursday, June 13, 2019

future of work & AI

so techno-optimists paint a rosy future image with AI freeing us up from toil to have a life of infinite leisure.

let's go back to the victorian times and the industrial revolution - what happened? machines (steam engines etc) meant that food (and transport) no longer required most people to grow what they eat, or feed the horses - so most people should have been able to get free food or travel to the seaside for a dip. what happened? most people moved from a fairly pleasant rural existence farming to working in the dark satanic mills - i.e. became urban factory workers with longer hours and shorter, less pleasant lives.

let's go back to when people stopped being hunter-gatherers and settled down to farming. could have been nice to stop worrying about the days when you stop being predator and become prey from time to time. but what happened? people built nation states and priesthoods and aristocracies and invented serfs/slavery.

so techno-pessimists paint a dystopic future picture with AIs enslaving us (or just disposing of us).
That's nonsense too.

So how will things play out? what of all this "makework" that most of us in the developed world engage in that is trivially automatable (actually, doesn't need doing)?

I have no idea, but we better figure it out soon.

[1] homework

