Saturday, December 07, 2019

Cambridge Comprehensive

Recently, several new members have joined the department, and happened to ask me how everything worked. I had to disappoint them, in that the last person who knew that was Stephen Hawking and he'd sadly died just before they arrived. However, having now been here for 20 years, this time at least, I thought I would have a go at explaining stuff.

People are classified as students or UFOs - students are initially manifold, until they are expurgated, at which point they can become UFOs. UFOs become UTOs when they are established through ground truths. UTOs can also later become fellows, provided they pass the rigorous exam in Benevolent Dictation. This then qualifies them to say grace and hand out favours such as maundy money, and to hold hands as they walk on the college lawn.

Colleges are basically country houses with nice lawns and staircases, which only UFOs and UTOs are allowed on. Students have to make their way to and from the bars and bedders by way of the outside walls, often climbing up precarious ivy. Over the 8000 years of existence, students have evolved to have primitive wings, but when they become UFOs, they lose the feathers on the wings, and so make do with gowns instead, to cover up their shame.

Departments are a relatively new invention, and are basically knowledge stores, a bit like John Lewis, except that departments are never unknowingly undersold. Other fleeting Institutions such as the Sanger and the Turing have no salience whatsoever.

Colleges are basically country houses with nice lawns and  staircases, as described above - heads of houses dispense classes in benevolent dictation over port and salud.

The University is an act of collective illusion, and (like Oxford) only exists in the minds of people who have read law. Tourists arriving at Cambridge station often ask for directions to the University, and as an act of kindness, are usually pointed to the busker outside Great St Mary's church, with the added explanation that this is the Bishop of Ely, who is deemed to have progressed beyond all forms of dictation, so that now he is allowed to sing Bach's Aegrotat in the original Welsh.

I hope that this has helped.

Thursday, December 05, 2019


dani had been increasingly frustrated in his relationship, conversations always seemed to end up in arguments, and increasingly frequently, he would lose the argument. his partner seemed to anticipate what he would say, but then (deliberately?) misinterpret it. Even more online than in RL. he decided that it was time to do something about it.

being technically inclined, dani decided to tackle the challenge scientifically.
first of all, he had to understand how the arguments proceeded, so he started to record all the conversations via his smart phone, and then transcribe the speech to text. he then found some open source NLP software that could storify the text, extracting and abstracting the trending topics and the sense and sentiment in the speakers' utterances. then he thought, "why be too clever? why don't i just apply predictive text to the line of argument that I am taking, then invert the sense, and use text-to-speech to replace what I was going to say?" indeed, he thought, why not automate both sides of the argument - he'd read about Generative Adversarial Networks in AI, and decided to build his own, dubbed Trouble and Strife (actor and critic).

The technology was a marvellous success, and arguments dissipated, evaporated before they even got going, life was wonderful again, harking back to the early days of their relationship.

then suddenly, out of the blue, he was served divorce papers by his partner's lawyer. and not just separation, but a demand for a massive amount of money that he had no idea he had.
It turned out that mani had known all along about the tech, and had built a massively successful business selling the software, initially to divorce lawyers, and later to barristers and judges, one of whom he ended up getting together with. Oh, and the audio recordings of mani, that dani's software had trained on initially? that was a mashup of snippets of alexa and siri arguing about which of them their owner was speaking to (although curiously, both voice assistants referred to "pet" rather than owner).

still, half of a lot of money is still a lot of money.

Monday, October 28, 2019

the new precariat

I've paraphrased William Gibson in the past - "the future is already here, it is just unfairly distributed".

People (Russell) worry about the way AI may dehumanise us. The less alarmist position (less alarmist than "the AIs will kill us all") might be welcome, but it is still quite a depressing image - the assumption is that that which makes many of us human (trivia, gossip, ephemera) will be automated away from us, and our humdrum existences will become less and less pointful, but also that the grand creative goals some of us might set ourselves will increasingly fall to the machines. In this world, the human race becomes more and more de-motivated and dispirited. As if this isn't already true - they seem to have missed a hundred years of work on alienation and the pointlessness of work post-industrial revolution, driven by time-and-motion studies, treating people as pluggable components (the sickness behind the phrase "Human Resources").

The reality will be much more of the same - a mandarin class which already exists will just get stronger - people that program the AI, who can hack the machine in the ML, will be the new hedge fund managers and political manipulators - everyone else will join the new precariat in larger and larger numbers, fed and watered and numbingly entertained just enough to stop them revolting. Maybe that is what they are saying... Maybe I should read the book :-)

So what's the solution? I've said it before - it is in SF literature (just like all the climate change writing for 50 years) - we need (thanks to Frank Herbert in Dune) a Butlerian Jihad. Not to get rid of machines, but to stop them usurping the charming little nonsense that makes people human, and the challenge of working stuff out in one's head (whether it's arithmetic or harmony).

Friday, October 18, 2019

driven to abstraction

Computer scientists sometimes say that their true discipline is about abstraction (modularisation, recursion, layering, isolation, information hiding, denotational semantics, etc)

but what if this is something more fundamental - what if the laws of the universe are layered, so there is a lawyering abstraction?
we learn mechanics, then gravity and acceleration and frames of reference, then fields, then waves and quanta - what if these aren't just pedagogic tools for making scientific progress[1] by continually improving our models of the universe? what if the laws of the universe are actually a series of better approximations? What if, as some people say, we live in a simulation, and we're just witnessing progressive rendering by different physics engines?

What other novel forms of abstraction might we envisage?

Well I can think of two simpler ones:

  1. The power/late ratio for binding - the later someone is to a meeting, the more power they probably have...
  2. The infinite number of interpretations possible for the performance by an abstract impressionist (was it Donald Trump or was it Cameron's pet pig? or was it a pink salmon riding a bicycle) - Rorschach was just scratching the surface.

[1] belief in progress is an abstraction of the complex effects of dementia.

Monday, September 23, 2019


A while back, we proposed a Sourceless Network Architecture. The notion was that, given the end-to-end argument suggests only putting things in a layer if everyone above that layer wants them, and that there are such things as "send and forget", where we don't expect an answer, then why does a recipient need to know where the packet came from? and if it does, the source can be put in the packet, perhaps as a name, so that if the source moves, the recipient has a better chance to still reply.

Now why do we need a destination address? This recent CACM article on metadata suggests using Tor type systems - but these use crypto and onion layered re-encryption to obfuscate the source and destination from third party observers. Why put the destination address in at all? why not just put the packet in a bottle, and throw it in the sea, to wash up on some beach where someone can take it out of the bottle, decrypt, and maybe answer the same way?

All we need is an Internet Sea with lots of  Internet Beaches. That cannot be too hard.

Wednesday, September 18, 2019

taxing the cloud

most ai runs in the cloud.

people are proposing a tax on the cloud.

so ai should have representation (no T without R, right?). votes for ai, now.

and more, can an ai commit a sin? if so, can we sell it an indulgence?
ai, go to hell now.

Monday, September 16, 2019

cryptocurrency and the singularity

humanity uploads itself to cyber-physical systems (aka robots) so it can swarm across the stars ahead of the heat death of the Universe. Neo-liberals, being the first to have the resources to do this, decide to implement an economic incentive system based on cryptocurrencies to make sure that the robots will spend some of their time working on mining spare parts (especially selenium for their stellar cells).

sadly, the proof-of-work in mining the currency consumes more energy than they can harvest in time, and crypto-humanity dies out without even leaving the asteroid belt.

Tuesday, September 10, 2019

the myth of the privacy/utility trade-off

People want to exploit your data. You want to exploit your data. Some people think it is bad if (your and other) data is stuck in silos, and not exploited. Some people think you should have the right to keep your data private and have made laws (GDPR being the latest).

So some other people now write that there is a trade off between privacy and utility - i.e. in some sense you can quantify the utility of the data, and you can quantify the level of privacy that the data is subjected to.

various privacy techs enforce privacy, but some are more specifically about protecting individual data from being relinked to a person (that person being re-identified in the data), by making the data anonymised or pseudonymised - or further, by subjecting collections of data to processes like fuzzing or adding noise, to provide some level of differential privacy (so the presence or absence of an individual's data record in the aggregate makes no difference to queries on the data, for some given query count at least).
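
To make the "adding noise" idea concrete, here is a minimal sketch of the textbook Laplace mechanism for a counting query. It is an illustration only, not something any particular privacy product does: the names `laplace_noise` and `noisy_count`, the toy `ages` data and the choice of `epsilon` are all assumptions made up for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng):
    # A counting query changes by at most 1 when one record is added or
    # removed (sensitivity 1), so the noise scale is 1 / epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def over_60(age):
    return age > 60


rng = random.Random(2019)
ages = [23, 35, 41, 58, 62, 67, 71, 80]  # hypothetical patient ages

# The same query with and without the last person's record: the noise
# makes it hard to tell from one answer whether that record was present.
with_record = noisy_count(ages, over_60, epsilon=0.5, rng=rng)
without_record = noisy_count(ages[:-1], over_60, epsilon=0.5, rng=rng)
print(round(with_record, 2), round(without_record, 2))
```

Smaller `epsilon` means more noise and more privacy per query; each extra query spends more of the budget, which is the "for some given query count" caveat.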

What's wrong with these pictures?

Let's unpick the "utility" piece - first of all, as a network person, I think of utility in terms of provider and customer. so in the internet, congestion management is a mechanism to do joint optimisation of provider utility and customer utility - the customers get the maximum fair share of capacity, the provider gets maximum revenue out of the customers for the resource they've committed. this formulation is a harmonious serendipity.

How might utility for individual data exploitation be harmonious with utility for aggregators of data?
An example might help - healthcare records can be used to compare/discover the effectiveness of existing treatments, and to discover relationships between different characteristics of individuals and well-being or onset of different medical conditions (i.e. inference!). Specifically, we might train a machine learning system on the data, and that would result in a classifier that, given new input about a patient, could offer a diagnosis. Or we might build a model that exposes latent (hidden) variables, and even, potentially, allows causal inference. So in the healthcare arena, there's alignment between what might be done with collections of patient data, to the benefit of all the patients. But such systems might then be turned into commercial products and run on subjects who were not part of the training data set. So what is the utility of that, to the original subjects? is their data not a form of contribution for which they should have a share in the ownership of any tech derived from it? To be honest, most of the hard work in generating the software was in gathering/curating (cleaning/wrangling) the data. the software itself is often open source, and requires little or no work. In many cases, supervised learning involved expert labelling of the data (e.g. surgeons/experts looking at records/images etc, and tagging them as having evidence of some condition or other, or not). Again that contribution is highly valuable. In this area, the presence or absence of an individual's data makes little difference (especially in a very large system such as the NHS with upwards of 70,000,000 patient records). However, the value of the data, in this case, grows super-linearly with the number of records, so 1 record here or there makes no difference, a thousand or a million is where the action is. So if we posit shared ownership of systems built on this data, then the utility to the individual, and to the public at large, is aligned.
If we just give up the data to for-profit exploitation, then the individual may end up paying for access to some machine learned tool, ironically trained on their own data. That's an obvious conflict.

Other data sets have diminishing returns as the amount of data gathered increases. A classic example is smart metering (water, electricity, gas etc)... the original UK deployment of smart meters reported, every few tens of seconds, the usage in millions of households. this is pointless. it consumes a lot of network bandwidth. the primary goal was to remove the need to have human visits to read a meter. a secondary (misguided) goal was to offer potentially smart pricing, so consumers (or smart devices, e.g. washing machines etc) could make dynamic decisions to reduce cost and reduce peak demand - this is a joint optimisation. However, the metering only needs to roughly band kinds of demand - maybe a few tens or hundreds of types of consumer and their demand profile types over the day/week/year. the pricing can just be broadcast, and is unlikely to change much - indeed, off-peak pricing of utilities was developed decades ago to do this. The actual individual usage is irrelevant, except for the aggregate bill. The model can be derived from that, in fact, or from a small random sample (small compared to 35 million households).
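
The "small random sample" point can be checked with a toy simulation. This is only a sketch on made-up numbers: the synthetic `population` (daily usage drawn from a normal distribution), the population size and the sample size are all assumptions for illustration, not real metering data.

```python
import random


def mean(values):
    return sum(values) / len(values)


rng = random.Random(35)

# Hypothetical daily electricity use (kWh) for a synthetic population.
population = [rng.gauss(10.0, 3.0) for _ in range(100_000)]

# Meter only a small random sample (2% of households here).
sample = rng.sample(population, 2_000)

full_mean = mean(population)
sample_mean = mean(sample)
print(f"all households: {full_mean:.2f} kWh, sample: {sample_mean:.2f} kWh")
```

The error of the sample mean shrinks roughly with the square root of the sample size, so past a few thousand households extra readings add very little to a demand model.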

So what price privacy? I don't see any trade off at all - you either keep your data yourself, or you share it (for a share in ...) with people who can make good use of it, but no-one else.

a separate problem with asking people to think about a tradeoff in this space is that there's tremendous imbalance in information about what can possibly go right versus what can go wrong (with privacy or with the price of your data). Let's just not go there at all.

Tuesday, September 03, 2019

The Myth of the Reliable Narrator

Post-modern literature (and film) abounds with wacky framing devices including the authorial voice and the unreliable narrator. This is all messing with the suspension of disbelief and dramatic irony that is the very life of fiction. But fiction is all a lie. There never was a reliable narrator. The book has a cover (a beginning, an end, a narrative arc etc). A film has those weirdest of things, music; pov; zoom/pan etc etc. The author/director/actors may all be dead by the time you read/view this.

So take every story with a grain of salt. or a large G&T. or a deep breath.

Even this brief note is utterly untrustworthy.

Friday, June 14, 2019

redecentralization 2.0

It has been a core theme of a lot of my work & interests. decentralized systems are just more interesting than centralized ones. they may be inherently more resilient (but not always), and they may be more complex (but not always).

the internet is largely decentralized in its lower layers (the tubes - the routers and links, and routing algorithms). that was always intended, from baran's report for rand onwards.
society and eco-systems are often decentralized (sure there are governments (but more than 1) and bee hives (but more than one)) - but coordination happens peer-to-peer (a term which first arose in magna carta, but an idea which predates that by a billion years).

decentralized, infrastructureless networks are an interesting point in the design space - hence community mesh wireless networks, and opportunistic, delay and disruption tolerant networks work merely using users' devices and construct communication out of thin air. in this extreme environment, we are challenged to think of how we provide information about identity or trustworthiness, but in fact, on close examination, a central provision of those properties has many problems too - DNS certificates can be bogus or expired, source IP addresses do not have to refer to where the packet came from, an application layer user identifier (email address, facebook identity etc) is no more a true name than the Prince of Serendip.

so really, everything should be decentralized, as it forces us to confront the true problems and come up with decent solutions, instead of using the prop of undeserved respectability of a centre.

That's why we're founding the centre for redecentralization. :-)

Thursday, June 13, 2019

future of work & AI

so techno-optimists paint a rosy future image with AI freeing us up from toil to have a life of infinite leisure.

let's go back to the victorian times and the industrial revolution - what happened? machines (steam engines etc) meant that food (and transport) no longer required most people to grow what they eat, or feed the horses - so most people should have been able to get free food or travel to the seaside for a dip. what happened? most people moved from fairly pleasant rural existence farming to working in the dark satanic mills - i.e. became urban factory workers with longer hours and shorter, less pleasant lives.

let's go back to when people stopped being hunter-gatherers and settled down to farming. could have been nice to stop worrying about the days when you stop being predator and become prey from time to time. but what happened? people built nation states and priesthoods and aristocracies and invented serfs/slavery.

so techno-pessimists paint a dystopic future picture with AIs enslaving us (or just disposing of us).
That's nonsense too.

So how will things play out? what of all this "makework" that most of us in the developed world engage in that is trivially automatable (actually, doesn't need doing)?

I have no idea, but we better figure it out soon.

[1] homework

Friday, June 07, 2019

Rashomon sets as a metaphor for why interpretability is hard

So this arXiv paper by Cynthia Rudin about why we should stop explaining black box AIs contains a beautiful metaphor, the idea of a Rashomon Set. For people who don't know, Rashomon is a classic film made by the Japanese director, Akira Kurosawa. Its plot is about an incident in the woods, told from multiple viewpoints, and as each one unfurls, you realize the previous one was not "true" for a different reason, until the "end" when you cease to be sure of what actually happened. Kurosawa made quite a few films that are not only classic, but slightly influential - for example, his series of lone samurai hero movies (Sanjuro, Yojimbo) were remade by Sergio Leone into great spaghetti westerns that made Clint Eastwood's early career (A Fistful of Dollars and For a Few Dollars More), as well as the Seven Samurai (The Magnificent Seven etc). Kurosawa also made fine Japanese versions of European classic plays (Throne of Blood == Macbeth, and Ran == King Lear). Of course, one of his slightly lesser (but still wonderful) films, The Hidden Fortress, got a thinly disguised makeover by one George Lucas as the first (and pretty much all the successor) Star Wars films. Kurosawa often cast Toshiro Mifune, who had some success in Hollywood movies - usually as a tough soldier, but rarely capturing the humorous element that was part of his subtle signature in his home country films. The only place where I think the japan<>western translations of film didn't work was a US remake of Rashomon (sadly, as it should be possible to do) - many of the others are great (in my personal opinion) in either take, whether Shakespeare or sci fi, samurai or gunslinger. If you see and like Kurosawa films, you will likely also enjoy books by Haruki Murakami, although don't blame me if you don't. Rashomon Sets - what a totally super idea! almost as good as explaining algorithms through Hungarian folk dance....

Monday, June 03, 2019

counterfactual reasoning example

spent a while yesterday trying to get additional car insurance on a 20+yr old subaru for member of family who has very recently passed driving test.

so go online on compare market and on several insurance company specific web sites and provide following info as input to their decision system:

1. car registration

2. existing insurance info

3. new driver license info

from the above, most (not all) the companies used the DVLA to verify car model/miles per year (via MOT at DVLA) and status of insurance and correct info about new driver...

so all ok (can obviously try making up other cars, but hard to fake driver:)

so outputs were mostly no - including existing insurance company, who said would add new driver after 6 months, but due to car's engine size (leaking quite a lot of info) they couldn't add a recently passed driver - this is slightly weird as the car we got was bought because it ranked as safest car in class by AA and others:) - they and two other outfits said no problem if we got a smaller car (suggesting less safe vehicles:(

tried various other types of insurance - e.g. car-sharing (borrow) allegedly targeted at students coming home in vacation borrowing parents' car - and pay-as-you-drive - all said no

so then ran a compare market on new driver insurance from scratch and got a couple of genuine offers - in fact, not completely mad prices either, if we're prepared to do a whole year (we are) ...(still with fact insured party isn't car owner or keeper, but is in the family)

so the range of prices is probably a proxy for the risk level the insurance companies will tolerate (I assume they all have pretty much the same actuarial data on accident/theft rate with age, gender, car model/age, location, use of car, and other stuff they obviously gather)....

privacy tech/statements from most of the website/forms/companies was pretty decent...

Saturday, June 01, 2019

Putting the n in Ethics - i.e. where's the ethnic diversity in our discussions of this important topic?

There's been a trend in recent years to suggest that when you're asked to be on a panel (as a bloke), you should decline unless there is a plausible gender balance policy.

There's been another trend in recent years to talk a lot about ethics and AI.
Both of these trends seem like a good idea.

It is my observation that the trends should be combined in  another way -

The vast majority of people I see talking about ethics and AI are weird, in a technical sense. while there is a better gender balance in ethics panel discussions than in pure tech, I think they fail in general terms to represent diversity. As I wrote this, I did see one discussion of a new direction from an interesting part of the world, namely China. I am sure there are discussions in many other places, but I don't think they are showing up in the 

Tuesday, May 28, 2019

the hype of incomprehensibility

I've been looking at various techs for a few years now and watching the lifecycle - it doesn't always involve hype - sometimes, things just seep into everyday life (the internet kind of did this over a couple of decades - even mobile phones kind of did) - so looking at things that don't make it, or have to go through some massive transformation to stand any kind of chance, one of the tells is that the tech is very badly explained, often hidden behind some simplistic banner-phrases like "blockchain" or "quantum computing" or "deep learning" - when you look at the swathes and tranches of literature, what is striking is a lack of straightforward examples.

Sometimes, this can be simply because the tech is actually rather subtle and also might involve understanding several other things first (quantum computing seems to fit in this category, Bayes methods like MCMC might be another) - other times, it is that smart people who make it their business to explain important new stuff in really straight-speaking ways (e.g. The Morning Paper) stick to stuff that is worth explaining.

So if you see a huge pile of gray-publications about something, and there isn't anything on one of the classier blogs or oped in a leading place, be suspicious (e.g. cold fusion, brexit, DLT, etc).

Friday, May 24, 2019

data is the new snake oil

we hear a lot of hot air about data being the new oil - implying there's a rush of innovation and profits as with a gold rush (there's money in them there data hills etc) -

this is so badly broken a metaphor, we need to unpick (deconstruct) it further

1. data is free to copy (nearly), i.e. data is in some sense renewable, while oil gets used up (it's nearly 50% gone now).
2. using oil does as much harm (or possibly more) as good
3. using data can do harm or good
4. AI/ML is compute intensive- deep learning in particular is massively inefficient, and data centers (like power stations, in close proximity to which they are sometimes built) burn %ages of globally generated electricity - not always renewable energy
5. data can increase in value as you have more of it, up to some point (sampling more about a population of people or things)
6. privacy could be modelled as efficiency (what's relevant/pertinent and what is none-of-your-business) in space and time (why do you still want to know that out-of-date thing about me or about that?).
7. much personal data collected by cloud providers is treated as if free, though some lawyers now are starting to point out that if you have a business model based on this, it is possibly a form of payment - so while facebook/zuckerberg might claim we are the product, if this legal position is true, we are customers, and he's working for us....
8. this mission creep really implies data could be the new fur (or indeed as john naughton has said, the new tobacco)
9. the models (e.g. face recognition, recommender networks etc) are often surprisingly bad - occasional successes of GANs&deep learning are relatively rare compared with a plethora of rather shoddy systems&applications.
10. perhaps data is the new oil after all, but it's rapeseed or snake oil that would be a more precise metaphor.

Wednesday, May 15, 2019

Winnie the Who & other short short stories

Winnie the Who travels around in her Time&Relative Dimensions in Pooh potty, having potty adventures with all and sundry, frequently combatting her nemesis, the Young Master Robin.

In the not too distant future, it is discovered that privacy is actually a fluid, called efluvium, that can be secreted by a genetically engineered gland. It turns out that the efluvium blockchain wasn't immutable at all, after all.

Meanwhile, AIs with untaxed imagination will from now on be clamped.

As the story wore on, he realized that time was running out, and soon, so was he

Tuesday, May 14, 2019

life cycle

this great talk at the Royal Society by Professor Mark Jackson riffed on the Fall and Rise of Reginald Perrin, as exemplifying the speaker's hypothesis, that the mid-life crisis is more socially determined, than biological - the midlife of course refers to the period between adolescence and senescence, which are also, to some extent, movable feasts.

I asked him afterwards how valid it was to apply the same notions to institutions (is the Royal Society middle aged? is democracy senile? etc etc)... but also, whether he could look at the 3 Rises of Leonard Rossiter, the genius actor who played Perrin so perfectly - and was previously in Rising Damp, perhaps an adolescent drama, and the very mature Resistible Rise of Arturo Ui, a Brecht piece which could very well be re-run as a riff on Nigel Farage, right now.

Thursday, May 09, 2019

digital twinge

there's a trending meme concerning the use of digital twins, which is reminiscent of the old DARPA Total Information Awareness fantasy - in that world, every physical thing (and being) is fully instrumented and telemetry is sent/gathered by the cyber-panopticon (quite likely in the sky, like some early James Bond villain).

The problem with the vision is that it gains you nothing and costs you the earth (almost literally in terms of resource use, e.g. energy in communications, storage and computation).

The digital twin metaphor substitutes (or clones) the physical world with a digital replica.
Aside from the basic resource costs, there are also interesting challenges such as making sure the digital replica actually is a clone, and hasn't drifted out-of-synch with the physical sister - much as Hari Seldon's recorded hologram appearances did in Asimov's great original Foundation trilogy, when his much vaunted psychohistory prediction of the future had diverged from reality due to the intervention of the Mule, a random mutation. And exactly as the Dr Who DVD blipverts (easter eggs) don't diverge at all in the fabulous Blink episode.

Aside from that, the reason it is pointless (as well as infeasible) is that it doesn't do what anyone wants - so you have a digital copy of everything - it doesn't abstract one whit - it doesn't let you introspect, it isn't interpretable or intelligible. One could subject it to some massive scale bayesian model inference, I suppose - but why? we have science - we have models of how physical (and social and biological) stuff works - we need only record instances and divergence from the model.
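
A minimal sketch of "record only the divergence from the model", as opposed to mirroring every reading. Everything here is hypothetical: the `linear_model` stand-in for "science", the `tolerance` threshold and the sample `readings` are made up for illustration.

```python
def record_divergences(readings, model, tolerance):
    """Keep only the readings that the model fails to predict."""
    log = []
    for t, observed in enumerate(readings):
        predicted = model(t)
        # Store an entry only when reality departs from the model by
        # more than the tolerance; everything else is implied by the model.
        if abs(observed - predicted) > tolerance:
            log.append((t, observed, predicted))
    return log


def linear_model(t):
    # Hypothetical physical model: a temperature rising 0.5 units per step.
    return 20.0 + 0.5 * t


readings = [20.0, 20.6, 21.0, 25.0, 22.1, 22.5]
divergences = record_divergences(readings, linear_model, tolerance=1.0)
print(divergences)
```

Six readings go in and one log entry comes out: the model plus a short list of exceptions carries the same information as the full telemetry stream, at a fraction of the storage and bandwidth.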

Friday, May 03, 2019

ethics theatre, but what kind of theatre?

so the internet abounds with people talking about ethics of ai. some organisations are accused of ethics washing, and others of ethics theatre. so what kind of theatre? is it soap opera (consistent with washing) or is it theatre of the absurd? is it tragic or comedic? is it a niche genre (science fiction ethics is quite a common trope) or is it a broadly popular form (romance, fantasy) or a deep, foundational literary form (tolstoyesque or janite, perhaps)?

perhaps it's a fan-contributed-literature (like those popular Dr Who and Star Trek Episodes, or the posthumous novels starring James Bond or Philip Marlowe)?

I think we need a TV series to examine this meta-question.

Thursday, May 02, 2019

redecentralized data centers

what makes us think a data center at the center is more efficient than a fully decentralized cloud of personal clouds?

partly it's just google, amazon, microsoft et al selling us services because of their initial requirement (a place to run pagerank, or a place to run billing and fulfilment for warehouses/delivery, or a place to run xbox arena stuff) - having spent a lot of money building a big place to (e.g.) pull all the web pages to and build an index so that search could run fast, and then having had to deal with node/link failures in the data center, replicating and adding redundancy and consensus algorithms and so on, you end up with a fairly expensive resource, which is idle a lot of the time, so you start to think about leasing off time on it for other stuff (services and customers/tenants etc) - hence cloud stuff emerges - really the only cost saving is that you have a bunch of smart sys-admin dev-ops lying around idle 23/24 of the day. so amortize their cost over some other customers :-)

given you can't just greedily walk the web at max speed, and in any case, web sites change rarely, a pagerank spider/robot can adjust its rate to match the expected next change - but this means you have plenty of spare cycles in any data center scanning sites where no-one is awake - or where no-one is still up playing games, or ordering stuff - so what do you do? you virtualize your compute, and net, and elastically (statistically) multiplex it (initially prioritizing your primary business, search, games, sales, but later dedicating new resources for these uses, and then priority pricing some to give them a "premium experience" etc etc)
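the rate-matching idea can be sketched roughly as below. this is only an illustration, not how any real crawler is known to work: the function name `next_revisit_interval`, the doubling/halving rule and the one-hour/thirty-day bounds are all assumptions made up for the example.

```python
def next_revisit_interval(current_s, changed, min_s=3600, max_s=30 * 24 * 3600):
    """Hypothetical rule: visit more often when a page changed, less often when it did not."""
    if changed:
        proposed = current_s / 2   # page changed since last visit: come back sooner
    else:
        proposed = current_s * 2   # nothing new: back off and save the cycles
    # clamp so we never hammer a site, and never forget it entirely
    return max(min_s, min(max_s, proposed))


# a site that stays unchanged for three visits, then changes once
interval = 24 * 3600
history = []
for changed in [False, False, False, True]:
    interval = next_revisit_interval(interval, changed)
    history.append(interval / 3600)  # record the new interval in hours
```

the longer the backed-off intervals get, the more of the machine's time is spare, which is the idle capacity the cloud business then leases out.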

start from a different starting place, and why bother? all that tech for availability would work fine in the wide area, and leave the data where it was and just run your algorithms (as they run on multiple nodes in the data center) on all the home machines - no data center, no waste of energy and bandwidth moving all that stuff to and fro all the time.

so could you then do search etc? of course you could. you'd gossip the index instead of the pages, you'd run p2p games, and you'd have virtual high street shops everywhere
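a toy sketch of what "gossip the index instead of the pages" might look like, assuming each home machine keeps a small inverted index (term to set of URLs) of its own pages. the names `exchange` and `search`, the example URLs, and the fixed exchange schedule (standing in for random peer selection) are all hypothetical.

```python
def exchange(index_a, index_b):
    """Push-pull exchange: afterwards both peers hold the union of their postings."""
    for term in set(index_a) | set(index_b):
        merged = index_a.get(term, set()) | index_b.get(term, set())
        index_a[term] = set(merged)
        index_b[term] = set(merged)


def search(index, term):
    """Look a term up in whatever part of the index this peer has heard about so far."""
    return sorted(index.get(term, set()))


# three home machines, each indexing only its own (made-up) pages
alice = {"stirling": {"alice.example/engines"}}
bob = {"stirling": {"bob.example/solar"}, "gps": {"bob.example/clocks"}}
carol = {"ptp": {"carol.example/timing"}}

# a fixed schedule of pairwise exchanges stands in for random peer selection
exchange(alice, bob)
exchange(bob, carol)
exchange(alice, bob)
```

only the postings move between peers; the pages themselves stay where they were written.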

this is not a new idea: xenoservers 20 years ago, were partly at least envisaged as enrolling home machine spare cycles for this very reason, not for virtualising nodes in an expensive, energy hungry, data center built for profit...

Sunday, April 28, 2019

machines like me? not really...

Literary writers trying their hand at science fiction often cause problems. for SF aficionados, it annoys, as often the literary writer is unaware of the tropes of the genre, or that a particular topic may have been extremely well explored in a novel of ideas, that was published in the genre and not recognised in the classical literary world. for the non-SF aficionado reader, the sudden importation of tics and tells of the SF world may also rankle, although this has become much less so in recent years as so many of the "great and good" of the aforesaid classic world of letters have decided to write books which at least tackle the world's recent challenges by first reading a batch of the appropriate science (whether the dismal science of economics, to write about the crash of 2008, or the warm science of anthropology, to write about gender in alternative histories of the future (you know who I'm talking about, right? not Ursula Le Guin:), or else the science of computers, to create stories about the impact of apparently intelligent machines on robots - oops, sorry, got that the wrong way round - we humans aren't apparently intelligent machines, that's the robots, that is :-)

Hence to Ian McEwan's latest, Machines Like Me, an everyday tale of rape, suicide, and possibly, murder. Fairly everyday stuff from this author you might think....however, in this case, he's decided to write a book set in an alternative present, with an alternative recent past, which, crucially, allows him the luxury of having Alan Turing as a living character who has pursued many of the directions hinted at in his work, so that lifelike robots are now (almost) an everyday, if somewhat expensive, reality in the world. McEwan acknowledges Hodges' fabulous biography of Turing as a source for background, rightly, as the character is pretty much what you'd get from that work, or else from the play, Breaking the Code (though less so from the film, The Imitation Game). Turing is also supported by a cadre of interesting loosely fictionalised people, to render the progress on AI tech more plausible - most notably, the real-life Demis Hassabis (DeepMind, also acknowledged as a source at the end of the book) is relocated about 25 years earlier in time than his real self, to help Turing create the more important foreground character of (would you believe it, or as the Cockneys have it, would you Adam and Eve it) Adam, an apparently functional synthetic human (don't get me started on why McEwan seems unaware of the fabulous exploration of this topic in the wonderful Swedish, then Brit TV series, Humans). The key "real" humans in this fictional work are Charlie, a mid-30s man of somewhat relative moral virtue, and his friend, Miranda, a student of very dull eras of history. I assume she's called Miranda as a sly reference to Shakespeare's character from the Tempest, who uttered the words "O brave new world, that has such people in't", on first seeing men. And that line is, of course, the source of the title of Aldous Huxley's classic literary SF dystopic vision.

Here, much of the dystopic vision is of the political/economic kind, and is set in a kind of mash-up of 1970s/1980s Britain (c.f. Man in the High Castle, Philip K. Dick's alt.history), with some amusing takes on Thatcher, if we'd (spoiler alert)... or what if Tony Benn..... or what about that IRA bomb in Brighton... oh, ok, I won't spoil those bits, as they make up some of the novel's interesting bits, in the sense that, for this reader at least, they created a very interesting alternative exploration of why the UK is where it is today, 3 years after the Brexit referendum.

As for the foreground tale of two humans and a machine like them, I thought that Charlie and Miranda were underwritten, and Adam was overwritten. I thought that the exploration of ideas like "what is consciousness" was about ok, but did not bring anything new to the table - the tension between Phenomenology and the Turing test (for what it is worth) for intelligence, and notions of EQ, was covered far more effectively 5 decades ago in "Do Androids Dream" by the aforesaid Philip K. Dick (who had actually studied philosophy and could write character and plot). I wanted to know more about Miranda's cranky dad (another non spoiler - there's a funny reflection on dementia and human mis-judgement that involves him and Charlie and Adam).

The ending did not bring a sense of an ending for me, rather left various plot-lines, moral questions, and unknown unknowns, still unknown.

Still, McEwan sure can write, so I'd recommend this book as a decent read, though below his best. He really should get out more, and so should I.

Wednesday, April 24, 2019


we need a program of work like the internet (web) and like raspberry pi, for a sustainable future - some core building block (not sure what, but I'd call it a Perpetual Notion Machine), that doesn't over specify what we do but leads to a plethora of new ideas from the grass roots/maker communities that would create cheap (free) energy and clean (safe) water and so on

one example tech I like is solar powered stirling engines (you can get toy ones e.g., or ones that actually generate quite a bit of electricity) as they can be built from junk (i.e. don't require solar cells) - we need a set of recipes for things like that so kids (climate strikers and extinction rebellion folks) everywhere can start to save us, as we sure aren't capable of saving ourselves and it's more their future than ours anyhow...

Tuesday, April 09, 2019

another thing you could do with disaggregated identity/credential systems

table 1. online harms in scope in the government report just released is a good summary of the problems....but it would be possible to do better - for me, it would be nice to pull out the applicable law (to the left most column) and discuss its adequacy e.g. in extending laws about publishing lies during elections (making sure law that applies to print and broadcast media applies to internet channels in the same way, etc etc)...

so on the technical side, we then have the problem of provenance/attribution. compliance with takedown requests (whether content is illegal, offensive or just wrong) can be driven by social pressure - so we need to know also if the content creator/distributor, and the complainers are legit - this needs accountable id - so it could be possible to use the same tech we might develop (and has been around in some research outputs 15+ years ago) to have a low cost mechanism for online channels to carry out "Know Your Customer" as much as banks do - but to use 3rd party credential providers (in the same way a bar can check "are you over 18" without having to know anything else about you...). this would then mean that we can verify things - and in the event of actually illegal content, with appropriate checks and balances, map the online pseudonym to an actual person....
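a minimal sketch of the "are you over 18" style check, under loud assumptions: the names `issue_credential`, `verify_credential` and `ISSUER_KEY` are hypothetical, and an HMAC with a shared key is used only to keep the example inside the python standard library. a real scheme would want public-key signatures or anonymous credentials, so that the channel doing the check could not mint credentials itself.

```python
import hashlib
import hmac
import json

# assumption: a demo secret held by the 3rd party credential provider
ISSUER_KEY = b"demo-key-held-by-the-credential-provider"


def issue_credential(pseudonym, over_18):
    """Credential provider: attest one attribute for a pseudonym, nothing else."""
    claim = json.dumps({"pseudonym": pseudonym, "over_18": over_18}, sort_keys=True)
    tag = hmac.new(ISSUER_KEY, claim.encode(), hashlib.sha256).hexdigest()
    return {"claim": claim, "tag": tag}


def verify_credential(credential):
    """Check the attestation; returns the claim dict, or None if it was tampered with."""
    expected = hmac.new(ISSUER_KEY, credential["claim"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, credential["tag"]):
        return None
    return json.loads(credential["claim"])


cred = issue_credential("night_owl_42", True)
checked = verify_credential(cred)

# someone edits the claim without being able to recompute the tag
forged = dict(cred, claim=cred["claim"].replace("true", "false"))
rejected = verify_credential(forged)
```

the channel only ever sees the pseudonym and the one attested attribute; the mapping from pseudonym to an actual person stays with the provider.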

advantage is also that it can be used for alibis too:-)

of course, some people still require genuine anonymity - e.g. whistleblowers or folks working under dangerous regimes - and that requires enough cover traffic from apparently non anonymous or just pseudonymous users, to work effectively.....

that also probably means we need to think about rate limiting the creation of new accounts and figuring out what tolerable rates are (they won't be zero)

all of this would need another report - which maybe the Turing Institute could offer to advise on:-)

Wednesday, April 03, 2019

reclaim the internet

at the schloss dagstuhl seminar (on programmable network data planes) we have had a couple of key speeches, one from nick mckeown (kind of on why we are where we are with P4 and switches that implement it) and one from dave tennenhouse on the big picture of where we are in programmable networks as a result of 4 decades of software (internet, atm, active nets and so on).

What I find increasingly weird is that the desperate need for speed is driven more and more by machine-to-machine communication (including the inevitable analytics/ML/AI) or just software updates or media pushing to caches.

but the net doesn't have to be fast for humans - humans could be supported by around 2 Mbps per person 24*7 for 10 billion people, that is no big deal. but humans are distributed for many good reasons (food, land/water constraints) and moving humans is expensive (and not sustainable)
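for scale, the arithmetic behind that figure works out as follows. the 2 Mbps and 10 billion come from the text above; the variable names and the per-day conversion are just for illustration.

```python
PER_PERSON_BPS = 2e6   # 2 Mbps per person, from the text
PEOPLE = 10e9          # 10 billion people, from the text

# total if everyone used their full share at once
aggregate_bps = PER_PERSON_BPS * PEOPLE
aggregate_pbps = aggregate_bps / 1e15   # petabits per second

# what one person's 24*7 share adds up to over a day
seconds_per_day = 24 * 3600
gigabytes_per_person_per_day = PER_PERSON_BPS * seconds_per_day / 8 / 1e9
```

that is about 20 petabits per second in aggregate, or roughly 21.6 gigabytes per person per day.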

but it is possible to put the machines in the right place (near the data) instead of moving all the data to the machines.

so we not only need re-decentralization for latency, energy, privacy, we also need to get all the machine data out of the way so humans can go back to talking to each other instead of being talked to by advertisement/recommender bots who steal all our data and use it to sell us stuff we don't want, so humans are replacing the old adage (you can trade time for money, or money for time) with paying bots in both time and money.

Thursday, February 28, 2019

open innovation

Yesterday, Allison Randal gave a great lecture on open innovation and open source and collaboration for the Turing Institute - the audience asked some really interesting questions (e.g. on open data; and on applying the model of open source to medicine - e.g. drugs - and on how much open source works across north/south divide in the world)....

I didn't want to take time, so I'll add my questions here:

1/ as well as open source software, we're seeing open hardware - not just processors (RISC-V) but peripherals too, and affordable 3D printing means even things like electric guitars - so the maker community that does a lot of this stuff (c.f. Cory Doctorow's books:) maybe exemplify the open collaborative ecosystems even more than coders, no? and they get to have a really good story about sustainability (repairing stuff is so much better than replacement).

2/ Sustainability - so machine learning (particularly deep learning) appears to be badly unsustainable in terms of the compute resource that training takes - this argues strongly for sharing trained classifiers - perhaps a carbon tax on neural networks could be turned into an incentive (carbon trading for AIs)

3/ some open ideas have horrible consequences - simple things like pagerank (which made google's search very hard to game compared to predecessors like Altavista) led to clickthrough which led to two-sided markets which led to surveillance capitalism. H-Index, which is supposed to replace publish-or-perish with citation count weight as a measure of quality, just leads to citation gaming. And so on - can we encourage replacements from the open source community, please, asap.

4/ Open source also depends somewhat on freedom of movement of people between different organisations - indeed, California mandated this right in law, stopping "golden handcuff" employment contracts. In the Cambridge ecosystem, I know people that have moved from microsoft to oracle to amazon (and even back) to work on the next thing. In the US, famously, people went from Cisco to Juniper to Cisco, as innovation moved around - this is clearly a Good Thing - perhaps folks who are working on Brexit could learn something from this (although they seem impervious to learning anything based on evidence).

5/ Most of the talk was on details of uptake of the culture and forms by industry - of course, the Turing Institute (and its member universities) are not-for-profit - so the culture is probably adopted in a slightly different light - what are the appropriate incentive forms for academia to adopt openness (aside from just being told to by our funding agencies:-) ?

Monday, February 18, 2019

rumours of his death are accurate predictions...

Back in the day, legend has it, a newspaper published an obituary of Mark Twain, while he was travelling in England, and he famously quipped "The rumours of my death have been greatly exaggerated".

What if someone built an AI (ok, ok, machine learning algorithm), that could precisely predict the day of your death; for everyone; to the day. with only the occasional false positive or false negative (for example, due to sudden accident, or sudden unexpected advance in treatment techniques)?

How fanciful is this? 

if you look at actuarial tables, and the error margins, you'd be surprised how accurately they describe expected longevity already, and note that these are typically compiled for purposes like life insurance (i.e. to give risk & therefore premium rates), and are not using all the detail that they might have (since the risk can be spread over broad groups, even if more precise predictions could be made). When looking after elderly relatives in recent years, these tables have been very useful for planning care costs and have proved (somewhat alarmingly) accurate.

It is also well known that as you get older, especially in middle age, details of life style+health conditions start to provide ever more precise predictors of how long you will last. However, to date, these systems only make use of a relatively small fraction of personal health data about you. What if we use it all?

Recent work in the Turing looked at predicting when someone would have a subsequent hospital appointment after a visit to A&E (or respectively for elective). Other work looked at mortality in certain wards and (more happily) at the predictors of who to release to go home and when with best chances of recovery. These systems have impressive performance (e.g. 98% to the day in some cases).

So imagine at some point (for the sake of argument, let's say if you make it to "majority"/adulthood), it is revealed what date you will live till? What would that do to society? Economics? What about ethics?

I can imagine people "cheating death" the wrong way.
No more health insurance (since there's certainty).
How might people plan funerals?
Many, many perverse incentives, unintended consequences and downright weird stuff.

Much harder than just having slightly better automatic language tools that might make better fake news, but also, much more likely to appear soon.

Friday, February 15, 2019

The Center for Mathematical Seances

The Center for Mathematical Seances (CMS), was not designed along the lines of Teletubby land, contrary to popular myth. In fact, it was flown here on Mistletoes by the Master of Duality College from the Miskatonic University in Arkham as a gift, when she realized that Cambridge could no longer afford architects, let alone builders. Of course, due to its unique Nomenklature, the Center (or Centre, which is of course near the edge of Cambridge) is able to operate at very reduced academic salaries, since many of its faculty have been, not to mince words, dead for some time. This has caused no end of problems with the REF, since outputs cannot be returned through research fish or ORCID, for deceased, and much of the original work was done too many centuries earlier to be allowed for impact case studies, Fermat's first posthumous theorem being a standout example, and Turing's work on quantum ghosting with Paul Dirac and Paul Wittgenstein, which will haunt the UKRI policy makers for decades, since it has proved to have so many applications and saved the British economy from collapse after brexit, for example allowing true precognitive behavioural economic models of trade to be built with genuine precision.

Friday, February 08, 2019

clock synchronisation in data centers - why not just use GPS?

in a recent discussion about accurate one-way latency measurements in data centers, it was asked "why not use GPS?" (instead of PTP for example) - the rationale being every smart phone has GPS in so it must be cheap...

at first blush, this sounds plausible, given the pervasiveness of GPS (G=Global) and decent accuracy, so I started to come up with a (long) list of reasons why it isn't (plausible) - starting from

1. you're indoors.
no reception -
ok, so a) run an antenna from the roof of the building down to each rack, and redistribute across the systems in the rack
or b) have one receiver and re-broadcast the radio signal in the server rooms and have an antenna on every rack or system

2. distributed antennae
a) need to calibrate the latency of the signal over the roof to each system + significant wiring cost
b) re-broadcast could cause significant interference, plus the racks themselves would cause massive multi-path problems - since they're stationary could obviously calibrate for that....

note you need to have sight of enough satellites at any given time to get reliable signal, plus they are not actually 100% reliable anyhow....

3. What about re-distributing the GPS received data in packets (ideally as part of ethernet pre-amble)
So you still have all the packet delay variance/jitter you had to solve with PTP (or NTP in the old days), and estimating that was the whole point of PTP - GPS is just one source - just having a local accurate master clock on some systems in the data center is fine and dandy

4. cost
phone GPS is actually Assisted GPS usually and not very accurate - indeed, even outdoors, multi-path (reflections off buildings) disrupts things sufficiently that most navigation systems resort to accelerometers, gyros and maps to validate (or rule out) received clock/location data. Also the signals need to be cross checked against Ephemeris data, which is 50k of info about where the satellites' orbits take them. real precision GPS (satnav in planes/boats) is not cheap in fact.

that'll do for now
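as a footnote to point 3, here is a rough sketch of the standard two-way timestamp exchange that PTP (and NTP before it) uses to estimate clock offset and path delay. the function name and the example numbers are mine; the formulas assume the path delay is the same in both directions, which is exactly the assumption that queueing jitter breaks.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate (offset, one_way_delay) from one two-way exchange.

    t1: master sends sync        (master clock)
    t2: slave receives sync      (slave clock)
    t3: slave sends delay request    (slave clock)
    t4: master receives delay request (master clock)
    Assumes the forward and reverse path delays are equal.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay


# made-up example in nanoseconds: the slave clock is really 100 ns ahead
# symmetric path, 500 ns each way: the estimate is exact
symmetric = ptp_offset_and_delay(t1=0, t2=600, t3=1000, t4=1400)

# asymmetric path, 700 ns out and 300 ns back (e.g. a queue in one direction):
# the offset estimate is wrong by half the asymmetry
asymmetric = ptp_offset_and_delay(t1=0, t2=800, t3=1000, t4=1200)
```

the symmetric case recovers the 100 ns offset; the asymmetric case reports 300 ns, an error of 200 ns. whatever the reference clock is (GPS or a local master), that error comes from the packet path, not from the source.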

Friday, January 04, 2019

Why Ethics Needs AI (or Machine Learning, if you want to be pedantic)

There's a lot of people talking about why AI needs ethics. In fact, more generally, there's been a lot of chat about why technology needs ethics for quite a few years now, as if technologists work in some kind of moral vacuum, which is a pretty demeaning way of referring to people who are very often quite aware of things like Pugwash and Asilomar. Computer Science related technology is one of the most inter-disciplinary of all science&technology disciplines, and practitioners are exposed to many application domains and sub-cultures. At one extreme, people have worked in cybernetics for 6 or more decades (e.g. ICA Cybernetic Serendipity show, curated in London in 1968). At another extreme, computing and cybernetic artefacts have been embedded in creative works for around a century (e.g. Karel Capek's play, R.U.R. from 1920). Much of the fictional work that is based on speculation about science has a strong moral element. This is often used as a simplifying approach to plot or even character, to see how a technology (as yet not realized) might play out in another (possibly future) world, or society. Thus recreational and societal control through drugs in Huxley's Brave New World, Robot detectives in Asimov's Caves of Steel, Genetically engineered aristocrats in Frank Herbert's Eyes of Heisenberg, are all doubly genre fiction (dystopia, 'tec and costume drama, as well as, of course SciFi).

However, these shorthands come with a powerful baggage - that of morality tales - since Aesop, story tellers want to have a message (don't feed the troll or it will feed on you, don't fly too close to the sun or your wings will fall off, don't imbue the device with intelligence without a soul or it may turn on you). These tales also build in solutions (it takes years of disciplined training to become a dragonrider of Pern, with great power comes great responsibility, a robot may not harm humanity or, through inaction, allow humanity to come to harm). Indeed, popular TV series from the last 4 decades (from Star Trek to Firefly) are often based directly on earlier morality plays (sometimes two removed from religious allegories via something slightly less old such as the Western movie genre).

Technogeeks are highly aware of this. They do not operate in a vacuum. Critics confuse the behaviour of large capitalist organisations with the interests or motives of people that work on the tech. This is an error. Society needs more fixing than individual crafts. Much more.

So on to Machine Learning. We hear so oft-repeated the negative stories of the misuse of "AI", from medical diagnosis with insufficient testing, through self-driving cars which require humans to stop driving (or cycling, or even just walking across the road - off their trolley!) on to the cliches of algorithms used for sentencing in courts which embed the biases of the previous decisions, including wrong decisions, and so reinforce or amplify society's discrimination. Note what I just said - the problem wasn't in the algorithm - it was in the data, taken from society. The problem wasn't in the code, it was in the examples set by humans. We trained the machine to make immoral decisions, we didn't program it that way ("I'm not bad, I'm just drawn that way", as Jessica Rabbit memorably said).

But as with the zeroth law, we can learn from the machines. We could (in the words of Pat Cadigan in Synners) change for the machines. We can quite easily devise algorithms that explain the basis for their output. Most ML is not black box, contrary to a lot of popular press. And much of it is amenable to Counterfactual reasoning even when it is somewhat dark in there. We can use this to reverse engineer the bias in society. And to train people to learn to reduce their unconscious prejudice, by revealing its false basis, and possibly socialising that evidence too.

We can become more than human if we choose this mutual approach.
