Tuesday, May 10, 2022

asymmetric power and language warfare

So the GPT-3 API release blog post from OpenAI (an API release, but not the models) does some virtue signalling about the possibility of misuse of the underlying models for disinformation. I'm not sure that washes (in the ethics sense), in that there's nothing to stop them being hired by someone to do evil for money - only if they had a radical governance model could they avoid the "maximise shareholder value" mantra/fate, surely. And note they are not the only game in town, even if they have that wonderful governance model - there's Google's new PaLM as well as the BAAI Wu Dao - there are quite a few organisations with access to hyper-scale cloud compute these days, so really the genie is out of the bottle. Maybe we need a new global governance - start with models like Asilomar or Pugwash, but then legislate? Perhaps the EU could lead the way by refining some of its rather shotgun AIA rhetoric?

One problem I have with the framing above is that I am not clear what exactly these near trillion-parameter "models" actually are - most simpler AI (including, recently, some smaller neural nets) can offer explainability (e.g. reflect on which features in the input are the cause of particular outputs, and why) - this is welcome as it brings them into the same body of work as much of earlier basic statistics (including the simplest forms of ML, such as linear regression and random forests) - there are good engineering reasons to have explainability whether the application of the tech is in, e.g. plain old engineering (autopilots) or health, but especially so when the domain is very human facing, such as (e.g.) law and language.

As mentioned in a recent meeting, I think the social media platforms, with their combination of various news feed ordering algorithms, adverts, filters ("do you mean to retweet that article before reading it", etc etc), basically constitute large language models already deployed in the wild. The idea of "not connecting your LLM to a social media platform" is out of date - Meta et al. already did. Given the toxicity of such systems, it seems obvious that we should have a Butlerian Jihad against these systems right now.

Thursday, May 05, 2022

The Robot Who Smelled Like Me

imagine a robot that was so like you that when it encountered certain smells, it was cast back in time to a certain memory of a place or an incident or a person? 

but then the sense of smell has been claimed to involve quantum level effects, so perhaps there would be an entanglement, or perhaps, just maybe, there was an entanglement, but that would no longer be (no cloning!) and you would forget.

another reason to fight against simulacra?

Monday, May 02, 2022

We are not living in a simulation.

you can't breathe data.

you can't drink code.

there's no sustenance in cpu cycles

there's no fond memories in RAM or SSD.

flash memories don't last.

threads are soon all bare.

we are not living in a simulation.

though we might be one.

Friday, April 01, 2022

Disruptive Times

 Welcome to the first issue of the Disruptive Times, of Brussels and London

We live in disruptive times, and no less so because humanity has made many unsustainable choices socially and economically, as well as technologically.

The global pandemic was a result of a combination of unsustainable food and travel culture.

The hike in energy and food prices is caused by unsustainable political organisations leading to invasion of the wheat belt of Europe by its gas and oil supplier.

The Internet is hovering on the brink of switching from unsustainable centralised cloud/web services, to even less sustainable decentralised services which can only be made trustworthy by proof-of-work, which  is not something to be considered by anyone wishing their kids to have dry land, reliable food supplies, and personal safety.

This is all exacerbated by small world networks, combined with a Zipf (power law) distribution of resources, which over time, given the myopic view of global capitalism, or the personal greed of the centrally planned economies and power bases of China and Russia, concentrates wealth in smaller and smaller subsets of the population. This is structurally unstable, unsafe and unsustainable. The increasing fraction of the population deprived of access to adequate supplies to live (whether dry land, housing, food, education, healthcare or just plain wellbeing) will eventually run out of patience, or the planet will fail, or both.

The way forward for technology is federation - a federation of many small to medium sized systems has locality, and can be dimensioned to match local supply and demand well. It reduces the immense waste of energy moving information (or other resources) to and fro across the world, increases engagement, ownership and control (and therefore privacy and resilience), and reduces latency. Resilience can be provided by occasional mirroring of content to neighbours in the federation, which is far less wasteful than continuous movement, can largely happen at idle times, and doesn't require copies to be online most of the time.

Federation, as an idea, can also be applied to business models - subsidiarity is our friend - local involvement keeps people interested, as we know from politics. But economically, having skin in the game (collective ownership) is an even stronger incentive - farmers used to build barns together, savings and loans companies (credit mutuals) were not for profit, and benefited all the investors or borrowers alike. So with the Internet. This is not "free", communistic, or even the old "peer-to-peer".

Examples like Guifi.net (and matrix.org or dataswift.io) show how we can create operational and governance models that have new and old elements - communities of interest bound via information shared through shared infrastructure can be sustained at the human and technical level, without the short sighted, destructive notion of maximising shareholder value that destroys even the most ideal for-profit organisations.

Next edition we can talk about some of the technical challenges of federating systems, including end-to-end assurances (of delivery, of integrity, of confidentiality, and in the end, of trustworthiness).


Note many systems in the Internet were federated - routing (through BGP), names (through DNS), keys (through certificate transparency), the Web (originally) through unidirectional URLs. The move to centralisation came about accidentally through the notion of search. While decentralised search worked (e.g. finding content in peer-to-peer systems), finding content in the one, single, global World Wide Web created incentives for people to try and boost their site's ranking in search results, if those people had a business to attract customers for. An idea from 1972, inverse document frequency (IDF) weighting in information retrieval, was rebooted as PageRank, which, to a first approximation, counts the in-links to a site (weighting each link by the rank of the site it comes from) - since that depends on other sites, but not on the site itself, it is almost a democratisation of the notion of popularity of the site (though static, rather than counting, say, actual visits). From this, it is a short step to monetizing this by a) selling advertising of things relating to the site b) measuring the visits ("clickthrough") to adverts so as to set the price charged to the advertiser.

While this could be done in the decentralised or federated, peer-to-peer world, it wasn't, and through nothing more than luck and laziness, many other services grew up to copy this business model, even when the adverts were associated with a non federated (i.e. non web) content, such as tv on demand, or social media sites. No excuse there then.
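To make the in-link idea behind PageRank concrete, here is a minimal sketch in Python. It is not Google's implementation: the tiny four-page "web", the damping factor of 0.85 and the fixed iteration count are all assumptions chosen for illustration. It compares a plain in-link count with the iterated version, where a link counts for more if it comes from a highly ranked page.

```python
def in_link_counts(links):
    """Count how many pages link to each page (the naive popularity measure)."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of linked pages.

    damping and iterations are illustrative defaults, not tuned values.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share, then shares passed along links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: a, b and d all link to c; c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}

if __name__ == "__main__":
    print(in_link_counts(toy_web))
    print(pagerank(toy_web))
```

In this toy web, page c has the most in-links and ends up with the highest rank, while page a, with a single in-link from the popular c, outranks b and d, which nobody links to. Neither score depends on anything the page says about itself, which is the "almost democratic" property noted above.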

Tuesday, March 29, 2022

DMA and Interoperation of E2EE secure messaging

if the key management systems don't interoperate, the services don't interoperate.

if the trust networks don't interoperate, the services don't interoperate

if i get your matrix messages, and stick them in a plaintext RSS feed you will find out, and i will lose your trust

storms and teacups.....

on the other hand, will this make meta up their game? that's a business decision, which I am not qualified to answer. but i think it might at least create an environment where some services may choose different business models, so that they can (up their game)

some further reading on what it may make people try, and how to keep it e2ee to e2ee - like it says:

"to the extent that the level of security, including end-to-end encryption where applicable, that the gatekeeper provides to its own end users is preserved across the interoperable services"

by the way, many people have most of the apps on their devices, so if those apps have open APIs, client side (secure) bridging is trivial (could put it in an enclave/trustzone if super worried about some apps being leaky) - could also use federation to build distribution trees for secure comms (with keygraphs).

Friday, March 25, 2022

Percom 2022 Perfail Workshop Panel


Jon's notes&answers for panel at Percom Perfail workshop on coping with failure...

First, can we talk about negative or inconclusive results more than failure?

1. Eleanor Roosevelt famously said: "Learn from the mistakes of others. You can't live long enough to make them all yourself." - Can you share your research experiences where you faced difficulties and how you overcame them? What are the common mistakes you see researchers make?

- Framing the problem wrong

- Not going far enough back in history of related work (even 10-20 years)

- Not choosing the right baseline comparisons.

2. Given your many years of experience, what are your suggestions and advice for young researchers on approaching a new research problem/area such that they minimize the risk of failures (in other words, how to publish a PerCom paper every year?) 

Along with avoiding the above errors, be prepared to refactor even very late in the work.

3. What is your advice for handling failures in long-term research studies where changing core methodology is no longer an option (e.g., in measurements, system design, etc.). Similarly, what is your advice for studies where ethical concerns became apparent at a much later stage?

If you are doing high risk research, use a  registered experiment publication (e.g. RSOS)  which allows for negative results to be published.


If the problem is that the measurement/design methodology turns out to be wrong, then the fact that it was a large sunk cost must be published to help other people avoid that cost!

If the problem is ethics (e.g. medical treatment turns out to be worse than existing known treatments), stop immediately and still document. (c.f. pharma companies are improving at this).

You actually have an ethical duty to report!

4. For folks with research industry experience, did you find any differences in how failures are handled in industry vs. academia?

Based on 9+ years on the advisory board for Microsoft Research: industry (and also startups) tends to call a halt right away and move on to the next problem to tackle.

Most academic research funding agencies still don't recognize the  value of failure, so many EU/UK/US projects limp on, and just report that  "work was done".

We need to retrain the funding agencies to accept that interesting (i.e. risky) research necessarily has more negative outcomes than positive. 

As with papers, a negative outcome (even just "this is not statistically significantly different")  is still a contribution to knowledge and NOT a failure. Methodologies not working is also useful knowledge.

5. How would you advise young researchers on handling unexpected results from a study? In your opinion, can the research be salvaged, or is it better to move on and start a new work?

Unexpected is the best!

6. How has your approach towards handling failures changed as you gained more research experience?

We actually had a Dagstuhl Seminar on DiY networking where we spent two days building "failure machines" - see report:


7. What steps can be taken to encourage the discussions surrounding failures/setbacks/learnings in different stages of research? How can early-stage researchers find safe spaces to discuss failures without the fear of judgments (from the advisor, group members, etc.)?

Find local workshops and PhD fora, and present, often!

Shadow Program Committees - e.g. IMC has a call out right now: https://conferences.sigcomm.org/imc/2022/shadow/ 

which is an excellent place to see papers that get rejected and WHY!

8. Do you have any coping mechanisms/mantras for dealing with rejections/failures (of research papers, grant/tenure/patent applications)?

For both grants and papers:

If you are confident, and there is substantial positive feedback in any reviews, then regroup, refactor, and resubmit.

If the negative feedback really is a showstopper (e.g. the work has been done before, or, see above, reframing doesn't work etc) then move on to the next thing.

Linus Pauling, who got two Nobels, said: “The best way to have a good idea is to have lots of ideas.” 

9. What is the one crucial lesson/advice you would like to share with your younger self in the Ph.D. program?

Submit papers/talks/proposals  - getting feedback from outside your bubble is vital. 

You will always encounter some negative feedback,  so the sooner you get used to coping with it, the better.

Tuesday, January 25, 2022

Academic life - just more work on the block chain gang? - citations are just non fungible tokens of esteem: discuss

I was idly thinking about worthless pieces of information, and was suddenly struck by the parallel between citations and NFTs. the citation graph is append only (practically never does someone delete a citation, let alone the actual citing paper, or cited work), so it is the very stuff of Merkle Trees. the repositories of record are persistent (e.g. British Library, Library of Congress etc), but independent, and cross checked (cf. Dewey Decimal etc) for consistency, so you have decentralisation.

and you have proof of work - academic life is (aside from the odd bit of teaching) all about writing (rarely about reading - hint, this could lead to a radical optimisation of the implementation above:)
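for the curious, the append-only part of the analogy can be sketched in a few lines of Python. this is a plain hash chain, the simplest relative of a Merkle tree rather than a full tree, and the class name, field names and paper identifiers are all made up for illustration. each citation record includes the hash of the record before it, so quietly editing or deleting an old citation breaks every later hash.

```python
import hashlib
import json


def entry_hash(prev_hash, citing, cited):
    """SHA-256 over the previous hash plus this citation, so entries are chained."""
    payload = json.dumps(
        {"prev": prev_hash, "citing": citing, "cited": cited}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CitationLog:
    """Hypothetical append-only log of (citing paper, cited paper) records."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    def append(self, citing, cited):
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        digest = entry_hash(prev, citing, cited)
        self.entries.append(
            {"citing": citing, "cited": cited, "prev": prev, "hash": digest}
        )
        return digest

    def verify(self):
        """Recompute every hash in order; any edited or removed entry shows up."""
        prev = self.GENESIS
        for entry in self.entries:
            expected = entry_hash(prev, entry["citing"], entry["cited"])
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


if __name__ == "__main__":
    log = CitationLog()
    log.append("paper-B", "paper-A")   # made-up identifiers
    log.append("paper-C", "paper-A")
    print(log.verify())                # chain is intact
    log.entries[0]["cited"] = "paper-Z"
    print(log.verify())                # tampering is detected
```

independent repositories that each hold a copy only need to compare the latest hash to check they agree on the whole history, which is roughly the cross checking between libraries described above.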

and we use this nonsense for H-Indices for employment, promotion, honours and the REF? (oh, ok, you aren't meant to use citation counts for that, but hey, people are lazy).
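as a reminder of how little is in the number: the h-index is the largest h such that h of your papers have at least h citations each. a minimal sketch in Python, with made-up citation counts:

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


if __name__ == "__main__":
    # Made-up citation counts for five papers.
    print(h_index([10, 8, 5, 4, 3]))
```

note that one paper with 25 citations and four with a handful gives a lower h-index than five middling papers, which tells you something about what the number rewards.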

so what is the exchange rate between BTC based NFTs and paper citations now? why isn't there one yet? we could easily set it up, and all retire on the proceeds....

