Monday, November 20, 2023

scholastic parrots

Having a conversation in the Turing with my mentor, discussing whether an LLM is just AGI, because AGI is "just" statistics and has also "just" passed the Turing Test... and we both observed that most interactions we have with other GIs (human intelligences) are pretty dumb.

So my main comeback to this is the usual repetition of the Theodore Sturgeon exchange: someone remarked that most SF is pretty terrible, and he responded that "most everything is pretty terrible". Intelligence is rare - most GIs can exhibit it, but only do so very occasionally, as intelligence is really not often very useful - habit is much more useful (thinking fast, rather than slow, is a survival trait, according to Kahneman and Tversky).

So, like many things, smartness is Zipf/heavy-tail distributed -

The title of this entry refers to scholarly works: most papers are cited less than once, while a few papers get tens of thousands of citations.
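As a toy illustration (synthetic numbers, not real citation data), drawing "citation counts" from a Zipf power law shows the shape: the typical paper gets essentially nothing, while the top outlier dwarfs everything.

```python
import numpy as np

# Toy sketch (synthetic, not real citation data): sample "citation
# counts" from a Zipf power law and compare median vs maximum.
rng = np.random.default_rng(42)
citations = rng.zipf(a=2.0, size=100_000) - 1   # shift so the minimum is 0

print(np.median(citations))   # the typical paper: (next to) no citations
print(citations.max())        # the outlier: thousands
```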

So you train an LLM on the Common Crawl, or on the Library of Congress, and the majority of the material you've trained it on isn't even second class; it is just variations of the same thing.

This isn't model collapse - this is an accurate recording of a model of what most people's visible output looks like. Dim, dumb, and dumber. So what?

Well, going back to the Turing Test: if you, an Average Joe, pick an LLM at random, prompt it with some average prompts, and compare it to the average GI, you will unsurprisingly conclude that the LLM has passed the Turing Test.

But what if you had Alan Turing (assuming he were still alive) at the other end of the GI teletype, I ask? And what if you got Shakespeare, Marie Curie and Onora O'Neill to ask some questions of both it and the LLM?

Then I suspect you'd find your LLM was a miserable failure, like the rest of us. Except that every now and then, we rise to the occasion and actually engage our brains, which it cannot do.

Tuesday, October 24, 2023

In-network processing - do we ever really need it?

We've looked at this problem from several sides now - to solve the "incast" problem, to do aggregation for map/reduce or any federated learning platform, and to aggregate acknowledgements for PGM.

When we say "in-network", we're talking about in-switch processing - borrowing resources from the poor P4 switch to store and process multiple application layer packets worth of stuff, so that only one actual packet (or at least a lot less) needs to be sent on its way.

So let's compare it with multicast (in-network copying) and its (largely) replacement by CDNs/overlays.

The key point is branches in the net - this is where the "implosion" (for incast) or "explosion" (for multicast) happens:

So do we have a server nearby? Or can we just put one there (or just connect one there)?

The answer (for multicast) is yes:

Netflix/PoPs in the wide area - use a distribution tree to all the PoPs, and caches.

So in the data center:

use servers, not switches, and build a sink forest of trees

In a Clos system, connect servers to a local (top-of-rack) switch, and on up to the spine switch/server... then, for the servers at some level, use a node at the next level up as the aggregation server (note that Clos even has redundancy, so this will survive edge/switch outages).
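A minimal sketch of the idea (the topology sizes and the aggregator-selection rule are invented for illustration): pick one server per rack as a rack-level aggregator, one of those per pod as a pod-level aggregator, and root the whole sink tree at a single server.

```python
# Hypothetical sketch: build a sink tree of aggregation servers over a
# two-level Clos/leaf-spine fabric. Server ids are (pod, rack, slot);
# the first server in each rack aggregates its rack, the first rack's
# aggregator aggregates its pod, and pod 0's aggregator is the root.

def build_sink_tree(pods, racks_per_pod, servers_per_rack):
    """Return parent[server] for the aggregation sink tree (root has no entry)."""
    parent = {}
    for p in range(pods):
        pod_agg = (p, 0, 0)               # pod-level aggregation server
        for r in range(racks_per_pod):
            rack_agg = (p, r, 0)          # rack-level aggregation server
            for s in range(servers_per_rack):
                node = (p, r, s)
                if node != rack_agg:
                    parent[node] = rack_agg   # leaves send to rack aggregator
            if rack_agg != pod_agg:
                parent[rack_agg] = pod_agg    # racks aggregate toward the pod
        if p != 0:
            parent[pod_agg] = (0, 0, 0)       # pods aggregate toward the root
    return parent

tree = build_sink_tree(pods=2, racks_per_pod=2, servers_per_rack=4)
```

Because the parent pointers only ever go "up" a level, each aggregation hop can sit on a server one switch away, and the Clos redundancy means any failed aggregator can be replaced by another server in the same rack or pod.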

Friday, October 13, 2023

Unseeing like a State

Just read Seeing Like a State, by James C. Scott. Incredible scope and vision for what is often (but not always) wrong with "tech"-led solutions, in a very broad sense. It looks at the imposition of regularised/normalised villages, farming, transport, city structures and so on, especially by the "developed" world on the (frequently) completely inappropriate contexts of colonies, but also post-colonial and self-imposed. From Russian collective farms to modernist cities like Brasilia, from monoculture farming to single-minded, wrong-headed cultural impositions: an amazing read!

It basically makes it pretty obvious why the following stuff happens:-

Tim Wu's Eyeball Bandits

Ian Hislop's Fake News

Doctorow's Drain Overflow

Basically, Internet users are the hunters and gatherers that just got fenced in and collectively farmed, like ants. Grate.

Monday, September 25, 2023

boxing clever with AI

There was this AI creative challenge where you had to figure out things to do with 4 objects, as follows:

A box, a candle, a pencil and a rope

Here's my 3 proposals:

1. Draw a still life of the candle and the rope on the box so that it looks 3D (i.e. draw on all 6 sides of the box, with the pencil)

2. Make a clock by setting fire to the candle, the rope and the pencil - they will burn at different rates, and you could mark out the seconds, minutes and hours in box lengths, then sit on the box, passing time

3. Have a boxing match between the pencil and the candle, in a ring made by the rope.

Thursday, September 21, 2023

dangerous AI piffle...

 So what's a dangerous model?

The famous equation E=mc^2 is dangerous - it tells you about nuclear power, but it tells you about A-bombs too.

This famous molecular structure is dangerous too - it tells you about DNA damage, but it tells you about eugenics too.

[picture credit: Zephyris, CC BY-SA 3.0]

So we had Pugwash and Asilomar, to convene consensus not to work on A-bombs and not to work on recombinant DNA. Another example: the regulator has just approved exploiting the Rosebank UK oilfield, despite the fact that solar and wind power are now cheaper than fossil fuels, and that COP26 made some pretty clear recommendations about not heating the planet (or losing biodiversity) any more.

What would a similar convention look like for AI? Are we talking about not using generative AI (LLMs, Stable Diffusion etc.) to create misinformation? Really? Seriously? That's too late - we didn't need that tech to flood the internet and social media with effectively infinite amounts of nonsense.

So what would be actually bad? Well, a non-explainable AI that was used to model climate interventions and led to false confidence about (say) some geo-engineering project that made things worse than doing nothing. That would be bad. Systems that could be inverted to reveal all our personal data. That would be bad. Systems that were insecure and could be hacked to break all the critical infrastructure (power, water, transportation, etc.) - that would be bad. So the list of things to fix isn't new - it is the same old things, just applied to AI, as they should have been applied to all our tech (clean energy, conserving biodiversity, building safe, resilient critical infrastructures, verifiable software, just like aircraft designs, etc. etc.)...

n.b. recall the Reinhart-Rogoff error - the trivial Excel mistake that led to the UK decision to impose austerity, which was exactly incorrect.

So dangerous AI is a red herring. indeed, the danger is that we get distracted from the real problems and solutions at hand.

Late addition:- "There's no art / to find the mind's construction in the face."

said Duncan, ironically, not about Macbeth...

So, without embodiment, AI interacts with us through very narrow channels - when connected to decision support systems, it is either via text, images or actuators, but there is (typically) no representation of the AI itself (its internal workings, for example), so we construct a theory of mind about it without any of the usual evidence that we rely on ("construction in the face"...) to infer intent (humour, irony, truth, lies etc.).

We then often err on the side of imparting seriousness (truth, importance) to the AI, without any supporting facts. This is where the Turing Test, an idea devised by a person somewhat on the spectrum by many accounts, fails to give an account of how we actually interact in society.

This means that we fall foul of outputs that are biased, or deliberate misinformation, or dangerous movements, far more easily than we might with a human agent, where our trust would have to be earned, and our model of their mental state would be acquired over some number of interactions, involving a whole body (pun intended) of meta-data.

Of course, we could fix AIs so they did this too - embody them, and have them explain their "reasoning", "motives" and "intents"... That would be fun.

Monday, August 21, 2023



Plenty can and has been said about networks (& systems) for AI, but AI for nets, not so much.

The recent hype (dare one say regulatory capture plan?) by various organisations for generative AI [SD], and in particular LLMs, has not helped. LLMs are few-shot learners that make use of the attention mechanism to create what some have called a slightly better predictive-text engine. Fed a (suitably "engineered") prompt, they match against an extensive database of training data and emit remarkably coherent, and frequently cogent, text, at length. The most famous LLMs (e.g. ChatGPT) were trained on the Common Crawl, which is pretty much all the publicly linked data on the Internet. Of course, just because content is in the Common Crawl doesn't necessarily mean it isn't covered by IP (Intellectual Property - patents, copyrights, trademarks etc.) or indeed isn't actually private data (e.g. covered by GDPR), which causes problems for LLMs.

Also, initial models were very large (350B parameters), which means most of the tools & techniques for XAI (eXplainable AI) won't scale, so we have no plausible reason to believe their outputs, or to interpret why they are wrong when they err. Generally, this creates legal, technical and political reasons why they are hard to sustain. Indeed, liability, responsibility and resilience are all at risk.

But why would we even think of using them in networking?

What AI tools make sense in networking?


Well, we've used machine learning for as long as comms has existed - for example, training modulation/coding on the signal & noise often uses Maximum Likelihood Estimation to decide which transmitted data best matches the received signal.
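A toy sketch of the idea (noise level and block size invented for illustration): for BPSK over an additive white Gaussian noise channel, the maximum-likelihood decision reduces to picking the nearest constellation point, i.e. the sign of the received sample.

```python
import numpy as np

# Toy ML detection for BPSK over AWGN: under Gaussian noise,
# argmax_s p(received | s) is the nearest symbol, i.e. the sign.
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=1000)
symbols = 2 * bits - 1                                  # {0,1} -> {-1,+1}
received = symbols + rng.normal(scale=0.3, size=symbols.shape)

decoded = (received > 0).astype(int)    # the ML decision rule
error_rate = np.mean(decoded != bits)
```

At this (invented) noise level the bit error rate is tiny; crank `scale` up and the estimator degrades gracefully, which is exactly the information-theoretic trade-off the text alludes to.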

This comes out of information theory and basic probability and statistics.

Of course, there is a slew of simple machine learning tools, like linear regression, random forests and so on, that are also good for analysing statistics (e.g. performance, fault logs etc.).
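For instance (entirely synthetic numbers, not real logs), ordinary least squares can pull a load/latency relationship out of performance measurements in a few lines:

```python
import numpy as np

# Toy illustration: fit latency as a linear function of offered load.
load = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
latency = 2.0 + 0.5 * load + np.array([0.1, -0.2, 0.0, 0.2, -0.1])  # noisy

A = np.vstack([np.ones_like(load), load]).T        # design matrix [1, load]
(intercept, slope), *_ = np.linalg.lstsq(A, latency, rcond=None)
```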


But traffic engineering has also profited from basic ideas of optimisation - TCP congestion control can be viewed as distributed optimisation (basically Stochastic Gradient Descent) coordinated by feedback signals. And more classical traffic engineering can be carried out a lot more efficiently than by simply using ILP formulations on edge weights for link-state routing, or indeed for load balancers.
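The distributed-optimisation view can be sketched in a few lines (this is the Kelly network-utility-maximisation picture with made-up constants, not any real TCP stack): each flow takes gradient steps on its own log-utility, coordinated only by a shared congestion "price" fed back from the link.

```python
# Hedged sketch: congestion control as distributed gradient ascent.
# Each flow i maximises log(x_i) - price * x_i; the only coordination
# is the link's congestion price (think: queue build-up / loss signal).

def share_link(rates, capacity=10.0, steps=4000, lr=0.01):
    rates = list(rates)
    for _ in range(steps):
        price = max(0.0, sum(rates) - capacity)   # congestion feedback
        # gradient of log(x) - price * x  is  1/x - price
        rates = [max(0.01, x + lr * (1.0 / x - price)) for x in rates]
    return rates

rates = share_link([0.5, 1.0, 1.5, 2.0])   # flows start at unequal rates
```

Despite the unequal starting points and no direct communication between flows, the rates converge toward an equal share of the link, which is the sense in which TCP-like dynamics "solve" a global optimisation problem.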

Neural networks can be applied to learning these directly, based on past history of traffic assignments. Such neural nets may be relatively small, and so explainable via SHAP or Integrated Gradients.

Gaussian Processes

Useful for describing/predicting traffic, but perhaps even more exciting are Neural Processes, which combine stochastic functions and neural networks, are fast/scalable, and are being used in climate modelling already - so perhaps in communications networks now? Related to this is Bayesian optimisation.
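A from-scratch sketch of GP regression on traffic (the "diurnal load" curve, kernel length-scale and noise level are all invented for illustration, and no GP library is assumed): condition an RBF-kernel GP on noisy hourly samples and read off the posterior mean in between.

```python
import numpy as np

# Toy GP regression on a synthetic diurnal "traffic" curve.
def rbf(a, b, length=2.0, amp=9.0):
    """RBF kernel matrix between point sets a and b (hours)."""
    d = a[:, None] - b[None, :]
    return amp * np.exp(-0.5 * (d / length) ** 2)

rng = np.random.default_rng(1)
hours = np.linspace(0.0, 24.0, 25)
truth = lambda t: 5 + 3 * np.sin(2 * np.pi * t / 24)   # invented load curve
y = truth(hours) + rng.normal(scale=0.2, size=hours.shape)

# GP posterior mean at test points: m + K*^T (K + noise I)^-1 (y - m)
ybar = y.mean()
K = rbf(hours, hours) + 0.04 * np.eye(len(hours))
test = np.linspace(0.0, 24.0, 97)
mean = ybar + rbf(test, hours) @ np.linalg.solve(K, y - ybar)
```

The same posterior machinery also yields a predictive variance, which is what makes GPs (and their Neural Process cousins) attractive for traffic prediction: you get a confidence band, not just a point forecast.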


Causal inference (even via probabilistic programming) can be used for fault diagnosis and has the fine property that it is explainable, and even reveals latent variables (and confounders) that the users didn't think of - this is very handy for large, complicated systems (e.g. cellular phone data services) and has been demonstrated in the real world too.

Genetic Algorithms

Evolutionary or Genetic Programming (GP) can also be applied to protocol generation - and has been - and depending on the core language design, this can be quite successful. Generally, coupled with some sort of symbolic AI, you can even reason about the code that you get.


Of course, we'd like networks to run unattended, and we'd like our data to stay private, so this suggests unsupervised learning; and with some goal in mind, reinforcement learning especially seems like a useful tool for some of the things that might be being optimised.

So where would that leave the aforementioned LLMs?

Just about the only area where I can see they might apply is where there's a human in the loop - e.g. manual configuration - one could envisage simplifying the whole business of operational tools (CLI) via an LLM. But why use a "Large" language model? There are plenty of domain-specific (small) models trained only on relevant data - these have shown great accuracy in areas like law (patents, contracts etc.) and user support (chatbots for interacting with your bank, insurance, travel agent etc.). But these don't use the scale of LLMs, nor are they typically few-shot, nor do they use the attention mechanism. They are just good old-fashioned NLP. And like any decent language (model), they are interpretable too.

Footnote [SD]: we're not going to discuss Stable Diffusion technologies here - tools such as Midjourney and the like are quite different, though they often use text prompts to seed/boot the image generation process, so are not unconnected with LLMs.

Monday, August 07, 2023

re-identification oracle

Surely ChatGPT should be a standard part of any attempt to show whether allegedly anonymised data really is?

Effectively, it is a vantage point from which to triangulate (any and almost every angle)...
