A True History of the Internet

Saturday, July 12, 2025

a brief history of asking for forgiveness versus permission - the napsterisation of AI...

back in the day, the Jesuits used to say that it was better to ask forgiveness than permission. I think that this may refer to the idea that people may have committed minor errors without knowing what they did was wrong, so they were less blameworthy, especially if, after the event, when the priest or other wise person explained to them the error of their ways, they recanted and were forgiven. To ask permission implies that the answer might be "no".

So now we are in a world where people are being paid to run stuff like this legitimised botnet, effectively becoming part of a P2P file sharing world. Once upon a time (a generation ago, or almost infinitely in the past) if you ran a thing like this (the Napsterised Internet) you would get sharp letters from lawyers or even just be fined by the copyright infringement police.

Post Napster, Google acquired YouTube and took an interesting step...they basically took the Jesuit line, with a vengeance - the trick was that Google went and did massive deals with all the large copyright owners (actually paying quite serious money) and then if you or I uploaded something already covered by that agreement, then no problem. If we uploaded something not yet covered by the agreement, Google had an offer - they could offer advertising revenue, or possibly market research (popularity metrics), or as a last resort, take down the content.

While the large copyright owners have not been the best of friends to the artists who actually create stuff, this was at least semi-legitimate (I'm not a lawyer, obviously, but it seems to follow the aforesaid Jesuit model, and that has history behind it:-)

Now we have all those GenAI tools trained on a lot of content that is available on the non pay-walled Internet - this does not mean it isn't copyrighted. The AI/LLM companies are notably trying to claim fair use type arguments (which search engines 20 years ago most notably did not) - the difference may reflect a change of culture, a shift in legal interpretation (of say fair use) or perhaps simply a shift in power (AIs owned by companies that have a larger market cap than the GDP of most countries).

At least one of those AIs is run by the aforesaid search engine company. But others are not, and don't necessarily have search, and certainly don't appear to have done the large deals for content with those big copyright owning companies...

So the game is afoot... ... ...

Wednesday, July 02, 2025

from dyson to dolby

was at a thing in imperial college in their dyson center then went to the new cavendish (3.0) lab in cambridge's new dolby building, and got worried about the idea that these might both be about noise cancellation. obviously vacuum cleaners make sure that your vacuum is really really high quality so sound won't propagate at all, and dolby is all about boosting the signal and reducing the noise

but what happens if we combined these technologies, I hear you cry. Actually, I don't because that would be like the eponymous xenomorph after ripley kicks it out of the nostromo. All you can hear is the irritating sound track music.

Monday, June 16, 2025

LLMs: Social Engineering for Dummies

Has anyone tried to get GenAI to build Cyberattacks based on fooling humans via cognitive biases etc?

I mean, it seems like prompting for really good suggestions for non-technical (i.e. psyops) attacks would be a really good test, especially for agentic AI systems, to see if they have even a half-baked theory of mind...

don't say you haven't been warned...

Wednesday, May 21, 2025

autonomy & a commons...

for entities to have autonomy (i.e. be agentic) they must operate in some sort of decision space that is disjoint from the other entities with which they interact - conversely, if their decisions are fully specified by those other entities, then they are by definition not autonomous.

we could call this an agentic commons

Binding Real Live Human to Virtual Identifier with data minimisation - Ф-ML-Id

People use hashes of biometric data embedded in identifiers (digital documents) so then the verifier can re-acquire the biometric and (hopefully, in a privacy preserving way) verify the person standing in front of the camera is a) live and b) the same as the one in the document - this is so 20th century
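To make the "so 20th century" baseline concrete, here is a minimal sketch of the hash-in-the-document check. Everything named here is an assumption for illustration: `Credential`, `quantise`, `enrol` and `verify` are hypothetical, the coarse bucketing is only a stand-in for a real fuzzy extractor (raw biometrics are noisy, so a plain hash of them would never match twice), and liveness is assumed to be decided elsewhere and passed in as a flag.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    """Hypothetical digital document: carries a salted hash, not the biometric."""
    subject: str
    salt: bytes
    template_hash: bytes


def quantise(features, step=0.25):
    # Coarse bucketing so small sensor noise maps to the same template.
    # Assumption: a toy stand-in for a real fuzzy extractor / secure sketch.
    return bytes(int(round(x / step)) % 256 for x in features)


def template_hash(features, salt):
    # Salted SHA-256 over the quantised feature vector.
    return hashlib.sha256(salt + quantise(features)).digest()


def enrol(subject, features):
    # A fresh random salt per credential stops trivial cross-document linking.
    salt = os.urandom(16)
    return Credential(subject, salt, template_hash(features, salt))


def verify(credential, fresh_features, is_live):
    # a) liveness is checked elsewhere and passed in as a flag;
    # b) re-acquire, re-hash, and compare in constant time.
    if not is_live:
        return False
    candidate = template_hash(fresh_features, credential.salt)
    return hmac.compare_digest(candidate, credential.template_hash)
```

Note that the same face feature vector would be reused in every document that wants a biometric, which is exactly the re-linking problem the next paragraph is trying to get away from.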

Why not have a behaviour that actually relates to the verifier's actual domain requirements - say they want to check this person is allowed to drive - perhaps for renting a car - so they could measure the person's driving style, which could also be stored in their digital driving license shard of their identity. This could be very robust - gradually improves in uniqueness - and also stops people re-linking across different domains when people use the same feature/behaviours for multiple roles (face, being a common example) - we can also incorporate increasingly robust defenses against GenAI building deepfake people (via deepfake 3D faces or voices).

The verifier would then be an agentic AI which basically has two things, a measurement model (classic ML) and a physics model (of what people can do) - so now we have Ф-ML-Id...
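A toy sketch of that two-part verifier, using the driving example. All of it is assumed for illustration: the acceleration trace, the `MAX_ACCEL_CHANGE` limit, the two-number "style" feature and the tolerance are made up, and a real measurement model would be a trained classifier rather than a mean and a spread.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    physically_plausible: bool
    matches_profile: bool

    @property
    def accepted(self):
        # Both the physics model and the measurement model must agree.
        return self.physically_plausible and self.matches_profile


# Hypothetical bound on how much acceleration can change between samples (m/s^2).
MAX_ACCEL_CHANGE = 3.0


def physics_check(accels):
    # Physics model: reject traces with jumps no real driver and car could make.
    return all(abs(b - a) <= MAX_ACCEL_CHANGE for a, b in zip(accels, accels[1:]))


def style_features(accels):
    # Measurement model stand-in: mean and spread of the acceleration trace.
    mean = sum(accels) / len(accels)
    var = sum((a - mean) ** 2 for a in accels) / len(accels)
    return (mean, math.sqrt(var))


def verify(accels, enrolled, tolerance=0.5):
    # Compare the observed style to the enrolled one, and check plausibility.
    distance = math.dist(style_features(accels), enrolled)
    return Verdict(physics_check(accels), distance <= tolerance)
```

The point of keeping the two checks separate is that a generated trace can match the statistics and still fail the physics, and the verdict says which one failed.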

[lots of good background on Ф-M from the Turing's interest group]

The more boring application is in multimodal biometric systems that use face+voice+lipsynch on given phrases to work out who someone is and that they are alive - same thing - physics model predicts face movement and voice audio given text input, then verifies against previously onboarded voice & Ф-M 3D face model.

In facial reconstruction (e.g. taking oliver cromwell's skull and building up a picture of what he actually looked like in person) there's a model of how flesh covers bone - this includes models of the way the stuff we are made of folds/hangs/ages etc - these models are from first principles. Unlike deep learning systems trained on lots of labelled data to classify faces and extract features (eyes nose mouth etc) purely based on statistics and luck, these physics models are correct because that's the way the universe works. You can build hybrid, so called Ф-ML systems, which give you benefits of both the statistical approach, and the explainability of the physics model - the cool recent work also shows that the physics model lets you reduce the amount of data necessary for the statistical model - sometimes by 2 or 3 orders of magnitude, and retain the accuracy and recall of the stats model.
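A toy illustration of why the physics model cuts the data requirement (this is not the method from the work mentioned above, just a sketch under made-up assumptions): the "physics" is assumed to be a known `sin(x)` term, the unmodelled effect is a small linear drift, and the statistical component is a plain least-squares line. With only three training points, the line on its own cannot recover the curve, but a line fitted to what the physics leaves unexplained can.

```python
import math


def physics_model(x):
    # Hypothetical first-principles component, assumed to be known exactly.
    return math.sin(x)


def truth(x):
    # Toy ground truth: physics plus an effect the physics model does not capture.
    return math.sin(x) + 0.1 * x


def fit_line(xs, ys):
    # Ordinary least squares for y = a*x + b (the "statistical" component).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def mean_abs_error(predict, xs):
    return sum(abs(predict(x) - truth(x)) for x in xs) / len(xs)


train_x = [0.5, 2.0, 4.0]  # deliberately tiny training set
train_y = [truth(x) for x in train_x]
test_x = [i * 0.1 for i in range(61)]

# Pure statistical model: fit the line directly to the observations.
a_s, b_s = fit_line(train_x, train_y)


def stats_only(x):
    return a_s * x + b_s


# Hybrid: the line only has to learn the residual that physics does not explain.
residuals = [y - physics_model(x) for x, y in zip(train_x, train_y)]
a_h, b_h = fit_line(train_x, residuals)


def hybrid(x):
    return physics_model(x) + a_h * x + b_h


stats_error = mean_abs_error(stats_only, test_x)
hybrid_error = mean_abs_error(hybrid, test_x)
print(f"stats only: {stats_error:.3f}  hybrid: {hybrid_error:.3g}")
```

The example is rigged in the hybrid's favour (the residual really is a straight line), but that is the point: the better the physics, the less the statistics has left to learn.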

In the world of biometric id, there are requirements from applications (many use cases are with a human waiting for verification of some attribute) that mean you want fast, accurate and efficient models that can run on affordable devices in tolerable time.

You also want to be future proof against deep fake, and also against adversaries with access to complex system wide attacks.

I claim that having this underlying real world explanatory component, alongside the statistically acquired twin, will be more resilient, and might even let you cope with things like kids and aging and other changes, as well as allowing you to verify attributes other than the standard biometrics of face, fingerprint, iris etc, in robust ways which provide better domain specific data minimisation.

Friday, May 16, 2025

Reflections on Distrusting Decentralization

with apologies to ken thompson

I keep hearing a lot about how decentralised systems can solve the massive loss of trust we are witnessing in large scale central organisations (technological, like hyperscale companies, and social, like national governments, and even economic, like banks).

I call 100% bullshit on this

(said in Natasha Lyonne's great voice, maybe we could ask her to do an audio book of Frankfurt's book on bullshit, or even graeber's book on bullshit jobs).

not that many people haven't lost trust in central agencies. that's been a factor of life for millennia - probably amongst many creatures, not just humans - i imagine the first deer to get hunted by humans suddenly found another threat to add to their collection. and tribes and nations invaded suddenly, or betrayed by "friends", "family", states in treaties etc. or banks going broke. or rivers and mines drying up.

sure, central things can lose trust, and, as the cliche has it, they will find it much harder to regain.

(though notice, people don't say often that it will never get regained, just that it is a lot harder to gain than to lose).

so what about decentralised systems and trust?

well, the answer is built into that oxymoron. a "system" isn't decentralised. it represents, at minimum, an agreement to follow a set of rules. Who makes that agreement? Answer, the participants. They buy into the API, the protocol, the rules of engagement and behaviour. And they can renege on it. They can stop obeying the law, they can start a run on the bank, they can over-graze the commons, they can get together and gang up on a smaller group, they can go amok, or form a mob, or start a revolution.

At some point the decentralized system must have some central component, even if it is virtual.

So why would I trust such a system more than a central system? I don't think I should. The problem is that there are no coordination points which means if I am in that minority, even just one person being ostracized, I have no redress, no recourse to recompense. There's no resource I can point to, to offset the misuse of decentralized power by the unruly mob.

In syndics (socially anarchic systems), there are meta-rules of engagement that are supposed to mitigate misbehaviour. for example, you are not supposed to engage with people (nodes in the net) with whom you have no overlapping interests (i.e. no resource conflict, no need to engage). If you do, then the metarule makes that misbehaviour in everyone's interest. Now there should be a people's court to try the misbehaving group and decide on a suitable redress (which might be ostracism) - sounds tricky? yup. did it ever work? Maybe for a short time in Catalonia a long long time ago.

How would that work in a distributed communications (or energy) system? Not well if you ask me. we only have one "space" for comms, and we only have one planet. There's got to be some root of trust (or multiple roots is fine), so you can anchor redress (for example).

Of course, you can build a hierarchy, which at the leaves looks like a decentralised system, but really, what you have is federated.

Sunday, May 04, 2025

uncanny cycle, hype valley - you choose

The cliche of the tech world is to trot out the infamous Gartner Hype Cycle and nowhere is this more prevalent than in AI, post ChatGPT (to be fair, post "Attention is all you need").

But the other slightly less worn hype is the phase that embodied (or perhaps even virtual) AI is said to go through, which is the Uncanny Valley.

So where do these two curves collide, eh? Right about now...

Monday, April 28, 2025

zero trust

I'm pretty sure this sort of thing happens because of bad parenting - people that don't trust anything come up with the idea of distributed systems that have no need of any anchor for trust anywhere.

so i have lots of problems with this - starting from systems - and the classic ken thompson reflections on trusting trust. These decentralization extremists have to confront that they run software on hardware, and even if they build their own hardware and write their own software, they probably use an OS and a Compiler from somewhere else. However, it gets worse. Why should we trust the actual zero knowledge protocols they use? who has verified them, and how? why do we trust those verification tools (peoples' brains too)? And worse still. Why should we trust this newfangled idea, zero? The Romans and Greeks and ancient Mesopotamians got along fine without it.

No. I have zero trust in zero trust.