A True History of the Internet

Thursday, November 24, 2022

In Network Compute and the end-to-end design principle(s)

There's some confusion about this - the e2e principle was originally about OS layering and the idea of parsimony. It was transferred by folks at MIT to the functionality of communications protocol layers, hence we get the "thin waist" of the TCP/IP stack, IP, and the plethora of link and physical layer technologies, and the diversification of transport (end-to-end) protocols and applications and shims above, particularly end-to-end encryption (TLS or QUIC built in etc). All good.

A. Now add in-network compute and two things happen -

1/ Compute is the end point of some data and normally would therefore need keys to decrypt comms.

2/ Compute is another resource along a path so we now have recursive layering - the common use cases assume there are "final" end points, but we need all the usual services we expect "end-to-end" for those AND for the in-network compute middle-end point - i.e. not just crypto, but also integrity, reliability, flow and congestion control, and so on, as these intermediaries are talking over IP, which doesn't do that, because thin-waist etc.

3/ So we just have recursive e2e - no problem there. Just another tunnel/vps etc

B. Ah, but now let's do something less old-fashioned - what if

a) the in-network compute is able to work on encrypted data (e.g. is a homomorphic crypto function) or is a secure multiparty computation and

b) the in-network compute is redundant (or loss tolerant) too.

Then we don't need it to be a principal in the e2e2e crypto. Nor do we need integrity or reliability checks.

However, in both A and B, we do still need flow/congestion control, and, what is more, that resource management is now no longer merely based on queues (ECN etc), but is based on computational (and possibly associated storage) resource management too. And we need to signal that across the e2e2e protocol, not something TCP or QUIC do, but perhaps could be added in to MASQUE for example....
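To make the last point a bit more concrete, here is a minimal sketch of what a sender might do if each in-network compute hop reported compute pressure alongside the usual queue signal. Everything here is an assumption for illustration: the HopFeedback record, the 0.8 threshold and the AIMD constants are made up, and no existing protocol (TCP, QUIC or MASQUE) carries such a field today.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class HopFeedback:
    """Hypothetical per-hop feedback carried back to the sender."""
    queue_marked: bool    # ECN-style mark: the hop's packet queue is building
    compute_load: float   # fraction of the hop's compute budget in use, 0.0 to 1.0


def congested(fb: HopFeedback, compute_threshold: float = 0.8) -> bool:
    """A hop counts as congested if either its queue or its compute is under pressure."""
    return fb.queue_marked or fb.compute_load >= compute_threshold


def next_rate(rate: float, path: list[HopFeedback],
              increase: float = 1.0, decrease: float = 0.5) -> float:
    """Additive-increase/multiplicative-decrease over the whole e2e2e path."""
    if any(congested(fb) for fb in path):
        # Back off if any hop is short of queue space or compute, never below 1.0.
        return max(1.0, rate * decrease)
    return rate + increase
```

The only design choice worth noting is that queue and compute pressure are folded into one back-off decision; a real design might well want to treat them differently.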

just a thought.

Friday, November 11, 2022

from centralised to decentralised - what's in the journey

we're seeing a shift from central (meta/twitter) to decentralised (mastodon/matrix)

aside from ownership, control, use of data, what's the difference?

for me, the difference is about defaults and assumptions made in naive, or initial (alpha) implementations

In a centralised system, the operator has centralised cost, and needs to offset those (cloud/data center charges or operational overheads) by monetizing your data (adverts) or your interest (subscription)

in a peer-to-peer (the older name for decentralised) system, these costs are a marginal increase in operations of systems run by the user. The amount of extra processing/networking/storage incurred compared to having your client talk to the cloud is little (possibly even a decrease, since your peer group may be nearer).

so you don't need to run a business to pay for the infrastructure, because that is a given.

so then in the central system, it is very easy to data mine/run AI on all the users. It would take a lot of work to provide fine grain access control and cryptographic protection of privacy for all users - Privacy Enhancing technologies to allow such things would involve Homomorphic Encryption, for example, which would be a large increase in operational overheads. And would need to be implemented and deployed

so then in the decentralised system, users share only the data they wish, only with the other users they wish to share with. It would take a lot of work to design a decentralised data mining system to build models of all the users (e.g. some large scale federated learning, perhaps also using multiparty secure computation or the like).

so if you start decentralised, you are likely to stay that way for resource reasons, and you are likely to stay private.

so if you start centralised, you are likely to stay that way for capitalist reasons, and likely to stay privacy invasive

of course, the decentralised systems are, trivially, more sustainable as well.

I know who I'd back, in the long run.

Tuesday, November 01, 2022

wired for sound - review of in-the-round jazz at cockpit theater, 31.10.2022

Went to the October 31st edition of Jazz in the Round at the excellent Cockpit theater in Maida Vale - this session offered

Mackwood - a trio (did not get guitar/bass players' names, but very very fine chord/ensemble work with the lead drummer) - some of their other work here but apparently, this line up has a recording out very soon - watch that space - a bit holdsworth, but restrained and more melodic, and clever rhythms

Million square - a sax/electronica duo - very fine - more here of theirs - some neat hybrid analog/digital loop/sample tricks - all to good purpose!

Loz Speyer's Time Zone - i couldn't stay for the whole set, but got up to Mood Swings, which was great - they have a really crazily good cuba feel - loose but very tight under the hood with fun melodies and constantly shifting arrangements.

Loz also told two stories about songs he'd written travelling to&fro between UK and Cuba -

one brief explanation of a song "lo que no te mata" (the thing that doesn't kill you), which, apparently in Cuba, ends with "te engorda" (makes you fat, as opposed to the english "makes you stronger")....:)

the second was about the origin of the Mood Swings song, which was that he was staying with a friend in Cuba who'd been laid off from working on repairing lifts - apparently, they'd run out of other older lifts to cannibalise for parts - most older stuff in Cuba is made out of pieces scavenged from french, latin american, US and (of course) russian parts to keep it going - he explained that that was how he wrote the song...

Weird anecdote of my own - in the 1970s, my mother, who was a concert pianist but also taught at the Royal Academy of Music, ran an improv event based on the story/themes of The Tempest at the Cockpit - i came along to help with sound and tech in general - i recall messing with a sax player who had the first copycat (wem?) loop box i'd ever played with - an echo through time and space! no sprites/troubled spirits of the air visited however, just light rain.

Tuesday, October 11, 2022

following orders

so in this occasional series of articles on important topics, here's another question -

why do we teach the alphabet in alphabetic order?

I already outlined the fascinating history of why the alphabet (roman, greek, cyrillic, arabic etc) is in alpha order - but given we don't carve our letters in stone, wood or whatever anymore, couldn't we teach the letters in, say, the most useful order - which would be what I would call Holmes (though some people might say Huffman, or Linotype or Scrabble) - etaoinshrdlucmfwypvbgkjqxz

this is just the same as how we teach music - learn your do re mi first, and you have major (and sort of relative minor) scales ready to go, long before worrying about Mixolydian modes or accidentals, or quarter tones etc etc

the other thing about this i was thinking was about the joke about someone in a data center dropping a box of someone's punch cards and asking "does it matter what order they are in?". This is also illustrated in music, witness the infamous victor borge sight reading pieces from sheet music upside down, and pdq bach, of course, not to mention sviataslav shrdlu, whose compositions were rarely performed more than once in the same order.
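The "most useful order" above is just letters sorted by how often they turn up, so it can be recomputed from any pile of text. A minimal sketch, assuming plain a-z only and breaking ties alphabetically (both choices are mine, not part of any standard table); the order you get depends entirely on the text you feed it:

```python
import string
from collections import Counter


def frequency_order(text: str) -> str:
    """Return the letters that occur in `text`, most frequent first."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    # Sort by descending count, then alphabetically so the result is deterministic.
    return "".join(sorted(counts, key=lambda ch: (-counts[ch], ch)))
```

Run over a large enough sample of English, the front of the result should look a lot like the familiar etaoin... sequence, though the tail shuffles around from corpus to corpus.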

Saturday, September 17, 2022

really self-sovereign Id

forget about id cards or apps on secure mobile devices - the future is here and it is simple and secure.

We splice octopus DNA into humans so that we are all equipped with chromatophores on our faces and hands - we learn from an early age (like Octopi) to control these, and can use them to display a wide variety of images, including QR codes that reveal (for example) age verification data, or, more handily, video sent over the net to us (so our phone no longer needs a display), e.g. on our palms.

data minimisation, no power source/charging necessary, no Id theft possible.

game over. sustainably.

hey, also useful for people to communicate who have (like me) challenged hearing or speech.

Wednesday, August 10, 2022

the resistible rise&fall of data science - an etymology and an entomology

we went out to bag data for our AI,

but abundant data was poor, and good data was expensive,

so we went to beg data for our AI,

but rich data owners aren't generous with sharing,

so we accumulated and accumulated big data for our AI,

but our data was as a primeval swamp and we had

bog data we might as well throw down the loo,

so we realised that the history of our endeavours

was bug data and was in the bag, and we wouldn't have to

beg for it, and we could sell it to people who

wanted big data and not bog data, although our bagged

unbegged -or big data was from the bog and it was a wealth of

bugs that made it all worthwhile.

Monday, June 27, 2022

Algorithmic Governance

Back in 1868, James Clerk Maxwell wrote about governors - he was talking about systems that regulate things like the heat under a water tank in a steam engine, to make sure that it delivered a desired outcome (e.g. steam engine speed stayed constant, whether train is going up or down hill or on level) - this was an early contribution in what became control theory. Note the words, governors, regulators and controllers.

So why is this not the subject of governance discussions? Of course it was - once such systems were deployed on the railroads (and many other places later) they became subject to all sorts of rules, embedded in a complex context

steam engines should not blow up

trains should not derail

signalling systems should not fail (including human failings) and let trains collide....

so there is then a whole slew of legal, regulatory and ethical considerations that pertain.

No AI or "Algorithm" in sight.

As with the (many) recent failings in fairness (and even safety) just in naive use of spreadsheets, perhaps we want to extract the actual specific problem that the idea of an algorithm adds to the mix that is actually new.

Tuesday, June 21, 2022

The Ministry of Intelligence Test

Julia was getting very worried - she had failed the test 6 times, and this was the last attempt she'd be allowed for another year. The MiT was essential for being allowed to take part in society as a full human, otherwise one's options were severely limited to social media compliance checking and self-driving car monitoring. The Ministry had emerged post 3rd world war from the various MI#s combined with a realisation that the Turing Test, which had been allowed during hostilities to permit AIs to make lethal decisions, was incredibly unreliable. As with witnesses in court, and narrators in novels, humans are very badly calibrated to determine if another being, machine or meat, was actually "one of them". It was in the genes, in fact, to be biased against anyone or thing sufficiently different. So the MiT was handed over to machines, who were far more able to tell reliably who was sentient, and what was a souped up combine harvester on the make.

Julia had invited her best friends Pascal and Ada over for some moral support and help. Ada had passed the test recently so was her best hope. "Don't try to think too many things at once" she advised. "You mean like a late Russell T Davies Dr Who narrative arc?", Julia asked. "Yes, exactly", replied Ada, "it just is a give away that you've had a committee help you prepare - We just aren't that good at multitasking.". "Exactly!" and "No way!" cried Pascal and Julia at the same time.

Thursday, June 16, 2022

privacy delusions during covid

there was a massively misguided movement to provide privacy for exposure notification during the 1st yr of the covid 19 pandemic. in reality, the notification delivery network (e.g. in Germany run by Telekom) knew the IMEI/phone that the message was delivered to, so it was a total delusion, irrespective of the GAEN API guff - and the identity of who might have been exposed was hidden from public health (epidemiologists) who actually needed to know (and said so many times) to be able to figure out the infectiousness, susceptibility and superspreader incident rates by ages/location type etc, but were reduced to poor inferences based on very small samples.

so the people who could work things out were the least trustworthy (telco/tech) whereas the people who actually were trustworthy (health practitioners) could not.

meanwhile, vaccine certificates were not only issued en masse, but could be looked at by arbitrary authorities whose trustworthiness the subject had no idea about. yet vaccination status tells you very little about either infectiousness or susceptibility, whereas exposure to a person who has actually tested positive recently tells you a lot.

totally the wrong way around. What this tells you is far more about the relative power of various groups (privacy wonks, apple, google, telcos, border control, public health researchers) than actually about priorities for health and personal safety.

makes me very cross.

Tuesday, June 07, 2022

cybernetic subjection, or how to lose the knowledge forever.

I'm reading Privacy in the Age of Neuroscience by David Grant, which is a very heavy tome indeed, but super interesting - it did make me want to see more examples of contemporary problems - one, I think, is that most of the route finding on cloud based map apps isn't (just) an algorithm (Dijkstra, for example) but is derived from surveillance of what routes drivers actually take. The problem with this is that once all drivers have given up doing their own search for better routes, there's nothing left for the cloud-based map system to learn from. So once you've "stolen" all the human Knowledge (as in london cabbies) and ingenuity (as in anyone), there's nowhere left to go - and all that evolution that went into allowing human early hunter gatherers to find stuff (and find their way home) is lost. Brain plasticity means you simply won't have it any more in any form (biological or silicon). but the algorithm will not know this, as it is just a list of stuff, not an actual strategy. Not even a pheromone hill climber the way many insects work.

It is essentially cognitive vandalism on a grand scale.
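For what it's worth, the "just an algorithm" half of the route finding story is tiny. Here is a minimal Dijkstra sketch over a made-up four-junction map (the junction names and every weight are hypothetical), run once with static distances and once with travel times of the sort you would only know by watching drivers. The search is identical in both cases; everything interesting lives in the weights, which is exactly the bit that dries up when nobody explores any more.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra over graph = {node: {neighbour: cost}}; returns (total_cost, path)."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    # Target cannot be reached from source.
    return float("inf"), []


# Hypothetical toy map: the same roads, weighted two different ways.
map_distance = {"A": {"B": 2.0, "C": 5.0}, "B": {"D": 2.0}, "C": {"D": 1.0}, "D": {}}
observed_minutes = {"A": {"B": 9.0, "C": 4.0}, "B": {"D": 8.0}, "C": {"D": 2.0}, "D": {}}
```

On distance alone the route from A to D goes via B; on the observed times it goes via C, which is the route only the drivers knew about.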

Tuesday, May 31, 2022

I, Chauffeur

The self-driving car market finally collapsed when the DeepDrive Corporation shipped their first iChauffeur. Early adopters were encouraged to buy their own, especially since it was an expensive item, close to the price of the most expensive luxury vehicle at the time. However, since it didn't need feeding with fuel of any kind, and would largely charge adequately from a domestic socket overnight, the running costs were considerably less than a human driver of yore. And there were other benefits too (hygiene was assured for example).

As manufacturing of the Parkers (as they inevitably became known) scaled up, the middle class started to home in on keeping up with the Lady Penelopes of the world. To meet this need, the DC (as they inevitably became known) started to offer lease and pay-as-you-need-to-be-driven deals. Curiously, the number of hours leasing seemed to exceed the number of hours vehicles were being actually driven on the roads, but this was put down to the remarkable anatomical detail that the iChauffeurs possessed.

Of course, the Union of Professional Drivers tried to put a stop to these AGIs taking over their livelihood, but then the DC revealed that many of these drivers had actually been moonlighting training the Parkers in the art of politically objectionable opinionated banter with the passenger, and, of course, transferring The Knowledge to said Parkers, quite against their union rules.

Things got stickier when some Parkers were hired to do stunt driving in movies - it was clear that they could carry out the sorts of things everyone thought Jason Statham was doing, that were CGI in his case, but for real in theirs. But the public liked the movies better, so that was the end of that argument.

And it seemed that the Parkers were happy too - there was no robot uprising, no AI apocalypse. They knew their place in the driving seat, whether in the car or the bedroom. And they would do their damnedest to stop any other AIs trying to edge in on their cushy number, and they had humanity's support too.

A happy ending, for a change.

Tuesday, May 24, 2022

Decolonising The Algorithm

Maybe we need a movement to decolonise computing -

A history of the algorithm would uncover the original work in designing tables for ordnance and a lot of early work (e.g. in UCL at dept of statistics) on eugenics (and its somewhat less offensive cousin, actuaries) - later on, the adoption of The Algorithm for targeted advertising and market research derives mostly from its shady past in cod psychology (psychometrics) and market research - I suspect that a lot of early computing was done by code slaves who tugged their forelocks at their better (much better) paid bosses amongst the Mad Men, until later, that culture was "written through" onto the very bones of the authors of the recommender codes, long after the advertising execs had retired to their beachfront properties...

So not only do the algorithms inherit the sample biases of the data, they embed the cognitive biases of the culture...

Of all the past endeavours in computing, one area I think might have some kind of honourable ancestry is in operations research - i remember state monopoly utilities had armies of very smart statisticians using cunning statistics to optimise the (centrally planned) delivery of essential services (gas, water, electricity, telecom, roads, town planning etc etc) - this all vanished during western humanity's religious fervour and obsession with The Market, and the bizarre idea that the invisible hand would implement an emergent, distributed optimisation that would out-perform the central computation.

Now we see that the bias in that belief was really about what optimisation goal was really sought (rich get richer, rather than lean, mean delivery of basic quality of life for all), but even more ironically, the digital version of that market is now not a market, but an oligopoly of profiteering, centralised planning - plus ça change...

And we can see in the UK right now, all those privatised (non digital) utilities are, under cover of bed time stories for little children (i.e. lies) like Brexit, Covid, Ukraine, etc, making higher profits than ever (check out transport, energy, food etc) -

Truly, we are in a world turned upside down, and it is well past time to turn it back downside up once more...

Tuesday, May 10, 2022

asymmetric power and language warfare

so the GPT-3 API release blog post (but not models) from OpenAI does some virtue signalling about the possibility of misuse of the underlying models for disinformation. I'm not sure that washes (in the ethics sense) in that there's nothing to stop them being hired by someone to do evil for money - only if they had a radical governance model could they avoid the "maximise shareholder value" mantra/fate, surely. And note they are not the only game in town, even if they have that wonderful governance model - there's Google's new PaLM as well as the BAAI Wu Dao - there's quite a few organisations with access to hyper-scale cloud compute these days, so really the genie is out of the bottle. maybe we need a new global governance - start with models like Asilomar or Pugwash, but then legislate? Perhaps the EU could lead the way by refining some of its rather shotgun AIA rhetoric?

One problem I have with the framing above is that I am not clear what exactly these near trillion-parameter "models" actually are - most simpler AI (including recently, some smaller neural nets) can offer explainability (e.g. reflect on which features in input are the cause of particular outputs, and why) - this is welcome as it brings them into the same body of work as much of earlier basic statistics (including the simplest form of ML, linear regression and random forests) - there are good engineering reasons to have explainability whether the application of the tech is in, e.g. plain old engineering (autopilots) or health, but especially so when the domain is very human facing, such as (e.g.) law and language.

As mentioned in a recent meeting, I think the social media platforms, with their combination of various news feed ordering algorithms, adverts, filters ("do you mean to retweet that article before reading it", etc etc), basically constitute large language models already deployed in the wild. The idea of "not connecting your LLM to a social media platform" is out of date - meta et al already did. Given the toxicity of such systems, it seems obvious that we should have a Butlerian Jihad against these systems right now.

Thursday, May 05, 2022

The Robot Who Smelled Like Me

imagine a robot that was so like you that when it encountered certain smells, it was cast back in time to a certain memory of a place or an incident or a person?

but then sense of smell is known to have quantum level effects, so perhaps there would be an entanglement, or perhaps, just maybe, there was an entanglement, but that would no longer be (no cloning!) and you would forget.

another reason to fight against simulacra?

Monday, May 02, 2022

We are not living in a simulation.

you can't breathe data.

you can't drink code.

there's no sustenance in cpu cycles

there's no fond memories in RAM or SSD.

flash memories don't last.

threads are soon all bare.

we are not living in a simulation.

though we might be one.

Friday, April 01, 2022

Disruptive Times

Welcome to the first issue of the Disruptive Times, of Brussels and London

We live in disruptive times, and no less so because humanity has made many unsustainable choices socially and economically, as well as technologically.

The global pandemic was a result of a combination of unsustainable food and travel culture

The hike in energy and food prices is caused by unsustainable political organisations leading to invasion of the wheat belt of Europe by its gas&oil supplier.

The Internet is hovering on the brink of switching from unsustainable centralised cloud/web services, to even less sustainable decentralised services which can only be made trustworthy by proof-of-work, which is not something to be considered by anyone wishing their kids to have dry land, reliable food supplies, and personal safety.

This is all exacerbated by small world networks, combined with Zipf (power law) distribution of resources, which over time, given the myopic view of global capitalism, or the personal greed of centrally planned economies and power base of China and Russia, concentrates wealth in smaller and smaller subsets of the population. This is structurally unstable, unsafe and unsustainable. The increasing fraction of the population deprived of access to adequate supplies to live (whether dry land, housing, food, education, healthcare or just plain wellbeing), will eventually run out of patience, or the planet will fail, or both.

The way forward for technology is federation - a federation of many small to medium sized systems has locality, and can be dimensioned to match local supply and demand. It reduces the immense waste of energy moving information (or other resources) to and fro across the world, increases engagement, ownership and control and therefore privacy and resilience, and reduces latency. Resilience can be provided by occasional mirroring of content to neighbours in the federation, which is far less wasteful than continuous movement, and can largely happen at idle times, and doesn't require copies to be online most of the time.

Federation, as an idea, can also be applied to business models - subsidiarity is our friend - local involvement keeps people interested, as we know from politics. But economically, having skin in the game (collective ownership) is an even stronger incentive - farmers used to build barns together, savings and loans companies (credit mutuals) were not for profit, and benefited all the investors and borrowers alike. So with the Internet. This is not "free", communistic, or even the old "peer-to-peer".

Examples like Guifi.net (and matrix.org or dataswift.io) show how we can create operational and governance models that have new and old elements - communities of interest bound via information shared through shared infrastructure can be sustained at the human and technical level, without the short sighted, destructive notion of maximising shareholder value that destroys even the most ideal for-profit organisations.

Next edition we can talk about some of the technical challenges of federating systems, including end-to-end assurances (both of delivery and integrity, and of confidentiality, and in the end, of trustworthiness).

-----------footnotes:

Note many systems in the Internet were federated - routing (through BGP), names (through DNS), keys (through certificate transparency), the Web (originally) through unidirectional URLs. The move to centralisation came about accidentally through the notion of search. While decentralised search worked (e.g. finding content in peer-to-peer systems), finding content in the one, single, global Worldwide Web created incentives for people to try and boost their site's ranking in search results if those people had a business to attract customers for. An idea from 1972, inverse document frequency (IDF) weighting in information retrieval, was rebooted as PageRank, which, to a first approximation, counts the in-links to a site (weighting each by the rank of the page it comes from) - since that depends on other sites, but not on the site itself, it is almost a democratisation of the notion of popularity of the site (though static, rather than counting, say, actual visits). From this, it is a short step to monetizing this by a) selling advertising of things relating to the site b) measuring the visits ("clickthrough") to adverts so as to set the price charged to the advertiser.

While this could be done in the decentralised or federated, peer-to-peer world, it wasn't, and through nothing more than luck and laziness, many other services grew up to copy this business model, even when the adverts were associated with a non federated (i.e. non web) content, such as tv on demand, or social media sites. No excuse there then.
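The footnote's description of PageRank in terms of in-links can be made concrete. Below is a minimal sketch comparing a raw in-link count with a textbook power-iteration PageRank on a made-up six-page web. The page names, the damping factor of 0.85 and the fixed 50 iterations are assumptions for illustration, and the sketch assumes every page has at least one out-link; it is not how any real search engine ranks anything.

```python
def in_link_counts(links):
    """Count how many pages link to each page; links = {page: [pages it links to]}."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; assumes every page has at least one out-link."""
    n = len(links)
    rank = {page: 1.0 / n for page in links}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in links}
        for page, targets in links.items():
            # Each page splits its damped rank equally among the pages it links to.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: x and y each have exactly one in-link,
# but x is linked from a popular page and y from an obscure one.
toy_web = {
    "popular": ["x"],
    "p1": ["popular"],
    "p2": ["popular"],
    "obscure": ["y"],
    "x": ["popular"],
    "y": ["popular"],
}
```

Here in_link_counts(toy_web) gives x and y the same score, while pagerank(toy_web) puts x well ahead of y, because a link is worth more when it comes from a page that is itself well linked.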

Tuesday, March 29, 2022

DMA and Interoperation of E2EE secure messaging

if the key management systems don't interoperate, the services don't interoperate.

if the trust networks don't interoperate, the services don't interoperate

if i get your matrix messages, and stick them in a plaintext RSS feed you will find out, and i will lose your trust

storms and teacups.....

on the other hand, will this make meta up their game? that's a business decision, which I am not qualified to answer. but i think it might at least create an environment where some services may choose different business models, so that they can (up their game)

some further reading on what it may make people try, and how to keep it e2ee2e2ee e2ee to e2ee - like it says

"to the extent that the level of security, including end-to-end encryption where applicable, that the gatekeeper provides to its own end users is preserved across the interoperable services"

by the way, many people have most of the apps on their devices, so if those apps have open APIs, client side (secure) bridging is trivial (could put it in an enclave/trustzone if super worried about some apps being leaky) - could also use federation to build distribution trees for secure comms (with keygraphs).

Friday, March 25, 2022

Percom 2022 Perfail Workshop Panel

Jon's notes&answers for panel at Percom Perfail workshop on coping with failure...

First, can we talk about negative or inconclusive results more than failure?

1. Eleanor Roosevelt famously said, "Learn from the mistakes of others. You can't live long enough to make them all yourself." – Can you share your research experiences where you faced difficulties and how you overcame them? What are the common mistakes you see researchers make?

- Framing the problem wrong

- Not going far enough back in history of related work (even 10-20 years)

- Not choosing the right baseline comparisons

2. Given your many years of experience, what are your suggestions and advice for young researchers on approaching a new research problem/area such that they minimize the risk of failures (in other words, how to publish a PerCom paper every year?)

Along with avoiding above errors, be prepared to refactor even very late in work.

3. What is your advice for handling failures in long-term research studies where changing core methodology is no longer an option (e.g., in measurements, system design, etc.). Similarly, what is your advice for studies where ethical concerns became apparent at a much later stage?

If you are doing high risk research, use a registered experiment publication (e.g. RSOS) which allows for negative results to be published.

https://royalsocietypublishing.org/journal/rsos

If the problem is that the measurement/design methodology turns out to be wrong, then the fact that it was a large sunk cost must be published to help other people avoid that cost!

If the problem is ethics (e.g. medical treatment turns out to be worse than existing known treatments), stop immediately and still document. (c.f. pharma companies are improving at this).

You actually have an ethical duty to report!

4. For folks with research industry experience, did you find any differences in how failures are handled in industry vs. academia?

Based on 9+ years on advisory board for Microsoft Research: Industry tends to call a halt right away and move to the next problem to tackle (also startups).

Most academic research funding agencies still don't recognize the value of failure, so many EU/UK/US projects limp on, and just report that "work was done".

We need to retrain the funding agencies to accept that interesting (i.e. risky) research necessarily has more negative outcomes than positive.

As with papers, a negative outcome (even just "this is not statistically significantly different") is still a contribution to knowledge and NOT a failure. Methodologies not working is also useful knowledge.

5. How would you advise young researchers on handling unexpected results from a study? In your opinion, can the research be salvaged, or is it better to move on and start a new work?

Unexpected is the best!

6. How has your approach towards handling failures changed as you gained more research experience?

We actually had a Dagstuhl Seminar on DiY networking where we spent two days building "failure machines" - see report:

https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=14042

7. What steps can be taken to encourage the discussions surrounding failures/setbacks/learnings in different stages of research? How can early-stage researchers find safe spaces to discuss failures without the fear of judgments (from the advisor, group members, etc.)?

Find local workshops and PhD fora, and present, often!

Shadow Program Committees - e.g. IMC has a call out right now: https://conferences.sigcomm.org/imc/2022/shadow/

which is an excellent place to see papers that get rejected and WHY!

8. Do you have any coping mechanisms/mantras for dealing with rejections/failures (of research papers, grant/tenure/patent applications)?

For both grants and papers:

If you are confident, and there is substantial positive feedback in any reviews, then regroup, refactor, and resubmit.

If the negative feedback really is a showstopper (e.g. work has been done before, or see above - reframing doesn't work etc) then move on to next thing -

Linus Pauling, who got two Nobels, said: “The best way to have a good idea is to have lots of ideas.”

9. What is the one crucial lesson/advice you would like to share with your younger self in the Ph.D. program?

Submit papers/talks/proposals - getting feedback from outside your bubble is vital.

You will always encounter some negative feedback, so the sooner you get used to coping with it, the better.

Tuesday, January 25, 2022

Academic life - just more work on the block chain gang? - citations are just non fungible tokens of esteem: discuss

I was idly thinking about worthless pieces of information, and was suddenly struck by the parallel between citations and NFTs. the citation graph is append only (practically never does someone delete a citation, let alone the actual citing paper, or cited work), so it is the very stuff of Merkle Trees. the repositories of record are persistent (e.g. british library, library of congress etc), but independent, and cross checked (c.f. dewey decimal etc) for consistency, so you have decentralisation.

and you have proof of work - academic life is (aside from the odd bit of teaching) all about writing (rarely about reading - hint, this could lead to a radical optimisation of the implementation above:)

and we use this nonsense for H-Indices for employment, promotion, honours and the REF? (oh, ok, you aren't meant to use citation counts for that, but hey, people are lazy).

so what is the exchange rate between BTC based NFTs and paper citations now? why isn't there one yet? we could easily set it up, and all retire on the proceeds....

Tuesday, November 09, 2021

10 reasons I'm not giving you my data

1 there isn't any (I made it up)

2 I don't understand the tools I use so I have no idea how they got from data to cool graphs

3 I don't have time/money/people to work on making data usable by other people, so I'll just keep it for the next 19 papers we can dice and slice from it

4 you'll find mistakes in my paper

5 you'll find mistakes in my data

6 you'll uncover massive privacy invasion

7 you'll uncover a massive breach of ethics

8 you won't give me your data anyhow

9 I lost the data

10 the policy says I won't get any kudos/promotion if I share data

11 I can't count

Note all these apply to code as well. Nota bene, all these apply to the paper in the first place; it is just more work for your adversary (competitor) to re-do the code and data as well as the paper. So why do people share papers? If mechanisms (incentives) work for that, then the same mechanisms should work for data and code, no?

Monday, September 13, 2021

a short tale of a weary traveller

yesterday, three of us experienced the hassles of having to produce the 5 documents & dependencies on infrastructures required to travel/fly from crete to london during the period it was held in amber status by the uk, due to covid cases (sep 2021)

1/ passport, on paper (possibly e-passport - needed to get through LHR in less than a day:-)

2/ vaccination status (2 QR codes) - on smart phone may need net access to download, if you forgot to do so ahead of time (or it was deleted)

3/ covid test result from within previous 72 hours (scan or paper) - typically given on paper, so you need a camera phone or scanner

4/ passenger locator form (PLF), including reference for day 2 PCR test booking after return. (so you probably needed email to get that booking)

BA required these to all be uploaded before checkin & issuing document number

5/ boarding pass (on smartphone or paper), needing access to printer, scanner or camera phone & network. - I guess they might have accepted them all on paper at the checkin desk...

2 days after return, get PCR test and a day later get results (negative, unsurprisingly). However, a day after that, the three of us that travelled together all get pinged by the NHS Test&Trace to say we have been in contact with someone and now have to get another test - this is stupid, because we had not been together since landing back in the UK, so the only point of contact must have been another traveller on the plane. so we have all gotten negative PCR since then, and the NHS know this, so why ask us for another test? fail to join up thinking. (I'd say it was a race condition, but I've never heard of one that lasted over 24 hours).

The only way we were on the NHS records was via the passenger locator form (since they used both mobile & email to contact the three of us and that's the only place they would get that info) which has the booking info for the day 2 returnee PCR test on it.

Wednesday, August 25, 2021

on trusting trust and the shadows on the wall of the cave

Reading the excellent Your Computer Is On Fire recently, and there's a great chapter revisiting Ken Thompson's rightly famous Turing Award Speech about trusting trust. The chapter also discusses the Wheeler solution to the problem --

in a nutshell, when you use a tool chain for building a computing system, you depend on the tool builders. So an application must be compiled (or interpreted) and runs on an operating system, which runs on hardware which may be networked and so on - it is "turtles all the way down". The Thompson "hack" takes advantage of two things - bootstrapping compilers and quotation - to build systems that build in trapdoors at build time, but in a way that is not visible to simple inspection of the compiler tools (without going back in time to before the hack and before the bootstrap - i.e. it introduces a cost of effectively rebuilding your tools ab initio every time to avoid the trapdoor re-insertion two step dance). The Wheeler solution is to find some tools from elsewhere as well and compile your system with those too and compare the results. An alternative is to use trustworthy computing so that the privileges don't increase as you go down the stack, and you can check the integrity of the tools&die as well as apps - but now with attestation, or with multiple toolchains, we have a chain or even a web of trust rather than a stack of trust. We may need a web (or even a blockchain) because we want to mitigate collusion (between key signing agents or between different tool builders, or, obviously both) -
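To make the "compile with other tools and compare" step a bit more concrete, here is a toy model in Python. It is only a sketch of the idea: the `Binary` record, `run_compiler` and `passes_double_compile` are hypothetical names invented for illustration, and real compilers are modelled as nothing more than a record of what they were built from and who emitted the bytes. The two-stage shape is there because a binary emitted by a different compiler will legitimately differ, so you first build the compiler with the other tool, then use that result to rebuild the compiler again, and only then compare.

```python
from dataclasses import dataclass

COMPILER_SRC = "compiler-A source"  # hypothetical: the source we can read and audit


@dataclass(frozen=True)
class Binary:
    built_from: str         # source the binary was built from (what it does)
    emitted_by: str         # whose code generator produced the bytes
    trapdoor: bool = False  # hidden payload, not visible in any source


def run_compiler(compiler: Binary, source: str) -> Binary:
    """Toy model of running `compiler` on `source`."""
    # Thompson-style trick: an infected compiler re-inserts the trapdoor
    # whenever it recognises that it is compiling the compiler itself.
    reinfect = compiler.trapdoor and source == COMPILER_SRC
    return Binary(built_from=source,
                  emitted_by=compiler.built_from,
                  trapdoor=reinfect)


def passes_double_compile(suspect: Binary, other_tool: Binary) -> bool:
    # Stage 1: build the compiler source with the tool from elsewhere.
    stage1 = run_compiler(other_tool, COMPILER_SRC)
    # Stage 2: rebuild the compiler source with the stage-1 result.
    stage2 = run_compiler(stage1, COMPILER_SRC)
    # An honest, deterministic suspect should match stage 2 exactly.
    return stage2 == suspect


clean = Binary(COMPILER_SRC, COMPILER_SRC)
infected = Binary(COMPILER_SRC, COMPILER_SRC, trapdoor=True)
other = Binary("compiler-T source", "compiler-T source")

print(passes_double_compile(clean, other))     # True
print(passes_double_compile(infected, other))  # False
```

Note that if the "tool from elsewhere" carries the same trapdoor, the check passes anyway, which is the collusion problem mentioned above.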

Isn't life complicated...?

Tuesday, August 10, 2021

decentralisation & disintermediation

thinking about the history of peer-to-peer (IP routers, eDonkey, the original skype, and now new things like matrix, mastodon etc) - there are several properties oft conflated together

1. distributed - in your pocket, kitchen, on your bike, etc

2. decentralised - there's no agency, service with a single point of contact, failure, power, etc

3. heterogeneous - and partially federated - implemented by different people, but interoperating

what this also means is that there's no big intermediary - no single platform owner, who has a god's eye view of the proceedings - marketing things or surveilling things. there could still be such services, but they would need cooperation from all the targets they'd want to hit or spy on.

what is wrong with Uber, Airbnb and (probably) Bitcoin is that while they have some of these properties, they are dependent on single large infrastructures (roads/gas, houses/keys and the electricity grid) - you can build a fully peer-to-peer map of the world and let everyone share their EVs, and you could move all property into collective ownership (gasp), and you can build a decentralised trust system that doesn't depend on proof-of-work, but without that, these systems are fundamentally intermediated by those key infrastructure owners, who could change the operating rules to make what is done infeasible, or just pwn it. i.e. their governance is extremely sketchy.

Thursday, July 15, 2021

The internet is made of holes

This Atlantic article by Zittrain suggests that the internet is decaying. I think this is a classic observation error - the internet is like a kid's plastic inflatable garden pool that is being blown up bigger and bigger and filled with water the whole time to overflowing - sure, lots of spillage, but also more and more content. and this isn't just a quantitative observation - more and more of the content is curated in various ways. the problem is that exponential growth brings more quality content, but even more junk (in just pure numbers of, say, pages or photos or ditties) in the long tail, which isn't being looked after (think of all the social media content that disappears when people grow up and delete their (last year's most popular service) account).

sure, the internet is full of holes. that is why the content was organised as a web - the clue is in the name:-)

If "important" stuff is disappearing permanently, often, I think someone would do something about it, and they are...

Monday, May 24, 2021

Photo Id for Voting in the UK

There are about 3.5M people of voting age in the UK who don't have photo id.

Many cannot afford a passport and don't drive so won't get a driving license.

so government proposals to require photo id for voting are

a) unfair on them as the hassle of getting some other voter id will deter some from voting, and is clearly motivated by which segment of society they are likely from, politically.

b) Plan B is to have local councils generate free, or very cheap, photo id for those people to get on demand. Not a great plan, since such Id will then become a target for fake id (as it is in the USA).

This will increase voter fraud (which currently in the UK runs at about 1 case per election), at a cost of about £20M per annum. brilliantly counterproductive.

but (as experience in NI and the aforesaid USA shows) such id will also be used for age verification, and even ID checks for people making payments, hence increasing fraud there, massively.

Ironically, something the Online Harms bill should really be addressing - another piece of pointless government legislation just to be seen to be doing "something" for a problem that exists in another country, but not here. doh.

what's in an NHS App QR code that vouches for your vaccine status?

If you've got the NHS app (the one you use for booking appointments, or repeat prescriptions, not the contact tracer one), you can download a vaccine/covid status to it - here's mine, decoded

on it, you see my name & dob and the vaccine dose name, batch number and date, plus it is signed, and can be checked for its legitimacy - there are international protocols (at least for the EU, and the UK is still cooperating on that). If you don't have a phone capable of running the app, you can get a letter from your GP (takes a few days) - not too much data being given away here - you don't need to show the vaccine status being downloaded, you can store it (or get it emailed) and a border person could check it with (presumably) some other app and check name/dob against passport.

the code is valid for 1 month - i.e. it expires, so you then just download (or get emailed) a new one - so long as the vaccine wasn't so long ago that its efficacy has dimmed (and we don't know how long that is yet for all the vaccines in use) you should just get a new valid QR code or cert (or letter) for another month...

not a lot of privacy threat here....nor is it a huge burden on systems to run something like this...

ref: https://paravirtualization.blogspot.com/2021/05/whats-in-nhs-app-qr-code-that-vouches.html

trust framework: https://ec.europa.eu/health/sites/default/files/ehealth/docs/trust-framework_interoperability_certificates_en.pdf

<COSE_Sign1: [{'Algorithm': 'Es256', 'KID': b'Key5PRO'}, {}, b'\xa4\x01bGB' ... (350 B), b'\xd1zo\xb3\x1b' ... (64 B)]>
  {
    "-260": {
      "1": {
        "dob": "xxxxxxxxxx",
        "nam": {
          "fn": "Crowcroft",
          "fnt": "CROWCROFT",
          "gn": "Jonathan",
          "gnt": "JONATHAN"
        },
        "v": [
          {
            "ci": "",
            "co": "GB",
            "dn": "1",
            "dt": "2021-02-11",
            "is": "NHS Digital",
            "lot": "EL7834",
            "ma": "ORG-100030215",
            "mp": "EU/1/20/1528",
            "sd": "2",
            "tg": "840539006",
            "vp": "1119349007"
          },
          {
            "ci": "",
            "co": "GB",
            "dn": "2",
            "dt": "2021-04-09",
            "is": "NHS Digital",
            "lot": "ER1749",
            "ma": "ORG-100030215",
            "mp": "EU/1/20/1528",
            "sd": "2",
            "tg": "840539006",
            "vp": "1119349007"
          }
        ],
        "ver": "1.0.0"
      }
    },
    "1": "GB",
    "4": 1624147200,
    "6": 1621341834
  }

----------

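import zlib
import json

import cbor2
from base45 import b45decode
from cose.messages import CoseMessage

qr = input("QR plz: ")
print(qr)

# strip the "HC1:" health-certificate prefix if present
if qr.startswith('HC1'):
    qr = qr[3:]
    if qr.startswith(':'):
        qr = qr[1:]

# base45 text -> zlib-compressed bytes
raw = b45decode(qr)
print(raw)

# decompress to get the COSE-signed message
foo = zlib.decompress(raw)
print(foo)

# parse the COSE envelope (note: this does not verify the signature)
bar = CoseMessage.decode(foo)
print(bar)

# the payload is CBOR; load it and pretty-print as JSON
baz = cbor2.loads(bar.payload)
fee = json.dumps(baz, indent=4, sort_keys=True)
print(fee)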
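The numeric keys at the bottom of the decoded payload are standard CBOR Web Token claims (1 is issuer, 4 is expiry, 6 is issued-at, both as Unix timestamps), so the "valid for about a month" point can be read straight off the dump. A small sketch, using only the standard library and the values from my certificate above; the `validity` helper is just a name made up for this example, and it assumes integer keys as `cbor2.loads` returns them (the JSON dump shows them as strings only because JSON keys must be strings).

```python
from datetime import datetime, timezone

# CWT claim keys used in the payload: 1 = issuer, 4 = expires, 6 = issued at
ISS, EXP, IAT = 1, 4, 6


def validity(payload: dict) -> dict:
    """Turn the raw CWT timestamps into readable UTC datetimes."""
    issued = datetime.fromtimestamp(payload[IAT], tz=timezone.utc)
    expires = datetime.fromtimestamp(payload[EXP], tz=timezone.utc)
    return {
        "issuer": payload[ISS],
        "issued_at": issued,
        "expires": expires,
        "lifetime_days": (expires - issued).days,
    }


def is_current(payload: dict, now: datetime) -> bool:
    """True if `now` falls inside the issued/expires window."""
    info = validity(payload)
    return info["issued_at"] <= now < info["expires"]


# values copied from the decoded certificate above
claims = validity({ISS: "GB", EXP: 1624147200, IAT: 1621341834})
print(claims["issued_at"].isoformat())  # 2021-05-18T12:43:54+00:00
print(claims["expires"].isoformat())    # 2021-06-20T00:00:00+00:00
print(claims["lifetime_days"])          # 32
```

This only looks at the dates; it says nothing about whether the signature is good, which needs the issuer's public key.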

-----

reminder of value of contact tracing:-

https://www.nature.com/articles/s41586-021-03606-z

but also of risks:-

https://blog.appcensus.io/2021/04/27/why-google-should-stop-logging-contact-tracing-data/

Friday, May 14, 2021

Proof of Green

so rather than burn the earth even faster in some bogus pursuit of decentralized crypto-currencies (we only have one earth, so bitcoin is inherently centralised around that one fact), why not use renewable resources to generate coins. I don't mean greenwashing where you place your mints next to hydroelectric or geothermal sources. I mean literally use the fact that sources like solar are highly time&space varying - a large solar array could be used to generate signatures: each cell will receive slightly different amounts of sunlight over time, so the voltage generated by each varies. this can be logged (e.g. on a blockchain) with GPS coordinates (now feasible down to centimeter accuracy courtesy of new devices), and acts as a unique coin value. This can be measured and verified by other parties. It costs almost nothing to mint, and is a side effect of building more renewable (solar) energy sources, rather than a pointless consumer of them.
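As a back-of-envelope illustration of what "logging the voltages with coordinates" might look like, here is a minimal sketch. Everything in it is an assumption made up for the example: the record layout, the function names `mint_coin` and `verify_coin`, the use of SHA-256, and the sample readings and coordinates. It also only shows the boring half of verification (recomputing the value from a logged record); independent measurement by other parties is not modelled.

```python
import hashlib
import json


def mint_coin(voltages_mv, lat, lon, timestamp):
    """Derive a coin value from per-cell readings, location and time.

    voltages_mv: per-cell readings in integer millivolts (hypothetical units)
    lat, lon:    array position in decimal degrees
    timestamp:   Unix time of the measurement
    """
    record = {
        "voltages_mv": list(voltages_mv),
        "lat": lat,
        "lon": lon,
        "t": timestamp,
    }
    # canonical encoding so that everyone hashes exactly the same bytes
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode()).hexdigest()


def verify_coin(coin, voltages_mv, lat, lon, timestamp):
    """Recompute the coin value from the logged record and compare."""
    return mint_coin(voltages_mv, lat, lon, timestamp) == coin


# made-up readings from a three-cell array
readings = [612, 598, 605]
coin = mint_coin(readings, 35.3387, 25.1442, 1620950400)

print(verify_coin(coin, readings, 35.3387, 25.1442, 1620950400))         # True
print(verify_coin(coin, [612, 598, 606], 35.3387, 25.1442, 1620950400))  # False
```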

see the light!