A True History of the Internet: May 2023

Wednesday, May 31, 2023

autonomous vehicles with a moral compass

AI ethicists got stuck up a blind alley with the trolley problem.

Any autonomous vehicle with a true moral compass would:

a) block human driven vehicles from making progress, as human errors in driving are the cause of many deaths every day on the roads.

b) stop and ask the passengers to get out and walk, as the pollution from vehicles (until we are able to generate all energy used renewably, and until manufacture of such vehicles is net zero) is accelerating global heating which is starting to threaten mega-deaths, so we need to change our lifestyles rapidly.

Of course, such behaviour would lead many humans to "other" the AI - given we can't tolerate human climate protestors (they "disrupt" things too much for commuters - boo hoo - wait till they see a real climate emergency hit these shores) - we will learn to treat them the way we treat foreigners (illegal aliens).

SF taught us this all so many times already decades ago.

Tuesday, May 30, 2023

Loopy AI versus the Human

At a conference recently, I heard someone propose the use of AI in a way that seemed to me to be exactly wrong.

At passport control in some countries, you have to present yourself for a photo, and right and left hand fingerprints (sometimes, thumb too). The complaint/motive was that the system asks you for these things in a certain order, and communication between immigration officers and visitors may be tricky due to language, culture, jet lag, etc etc - so the idea was to replace the communication from the immigration officer, to the visitors, with an AI that could figure out what you are doing wrong and tell you to do the right thing.

However, the idea that humans should fit in with the AI is, to me, abhorrent. What could be done better?

Well, obviously, the order in which you present these factors/attributes is irrelevant - the camera, left and right hands can be done anyhow, and the AI can detect what hand is offered (from finger lengths or from camera image) trivially - this is then simpler for the human. That's what machines are for: to simplify life from unnecessary, pointless, and trivial burdens. One could go a lot further of course. The visitor has presented a passport - perhaps that has fingerprints on it already and those could be used. If the photo matches the person in front of the camera, there's no need to take another picture, just read the e-passport (via NFC etc), and take a copy of the image there (or scan the one on the printed passport page...).
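As a minimal sketch of the "any order" point: assume some detector (hypothetical here) has already labelled each presented sample as `face`, `left_hand` or `right_hand`. The names `REQUIRED` and `collect_biometrics` are made up for illustration, not taken from any real border system.

```python
# Hypothetical labels a detector might assign to whatever the visitor presents.
REQUIRED = frozenset({"face", "left_hand", "right_hand"})

def collect_biometrics(samples):
    """Accept (kind, data) samples in whatever order the visitor offers them.

    Unknown kinds and repeats are ignored rather than treated as errors,
    so the human never has to be told they did things in the wrong order.
    """
    captured = {}
    for kind, data in samples:
        if kind in REQUIRED and kind not in captured:
            captured[kind] = data
        if REQUIRED.issubset(captured):
            break  # everything needed has been seen, stop asking
    missing = set(REQUIRED) - set(captured)
    return captured, missing
```

The only thing the machine ever needs to say to the visitor is what is still in `missing`, if anything.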

If one wants to go further, one can query the passenger manifest for international flights (it's part of anti-terrorism anyhow) and see what seat people had and who they sat near, and also measure the amount of sweat on the passport and see if the passenger/visitor is nervous etc etc and be completely creepy.

The main point here is that AI is not an excuse to automate a stupid process. It is an opportunity to re-think the process to make it more human friendly.

Thursday, May 25, 2023

Citizen-centric federation of digital services in the UK.

We have a number of services that many UK citizens already access online, and hence those citizens have access to information held about them - e.g.

private

School/college/workplace based intranet/cloud/VPN etc

communications

Internet Service Provider, mobile/cell, etc
Postal address

government

NHS
HMRC, DWP
DVLA

commercial

Social media (Meta/Twitter/mastodon)
Messaging (email/gmail/hotmail, whatsapp, signal, matrix)
Entertainment (Netflix, youtube)
Media (bbc, legacy web news)
Shopping & delivery (amazon, boots, tescos, ocado, deliveroo/uber)
Travel (rail/metro etc)

Financial

Banking (HSBC, Revolut etc)
Mortgage/savings/loans

All of these require secure sign on to use full facilities. So we have multiple digital identities in the UK.

Some share sign-on (e.g. via facebook or gmail) and even 2FA (Google/Microsoft authenticator or SMS).

Many people now use password managers or wallets to store account info including pass words/phrases etc, so from the human/user experience viewpoint, this complexity can be hidden at the access level.

However, few apps today allow management of data across all these domains, either for the service provider (whether commercial or government) or for the data subject, the end user, the citizen.

A few exceptions point the way forward - just for example, let's look at the thirdfort app, used for example by lawyers gathering information about possible mortgage borrowers, including standard information needed to do KYC (know your customer) and anti-money-laundering checks. This app (and any other like it) can use NFC on a smart phone to read your physical driving licence or passport, or just use the camera to take a picture and then OCR to get the text data from the id (which might include legacy paper information such as birth or marriage certificates etc), and then uses open banking to access (appropriately data-minimised) credit information (with permission from the client).

Note that these rely on standard interfaces (APIs) for NFC and document formats, and for banking - but they do not need a single, centralised global identity. They build on an eco-system to provide the service.

They work by federating information across services, but are rooted in the end user/subject. It is a relatively easy step to see how such app architectures could be used to combine health (NHS app access to my record) and, say, shopping (advice from health on what food to buy, for example), or travel and media, or finance and education etc etc.

There is simply no need for a national identity - especially not a card. Indeed, one can get smart phones good enough to run the apps I've mentioned for under £50 now. For inclusivity, giving a smart phone to citizens that cannot afford such a device is massively more beneficial compared with blowing the money on a single purpose centralised service, and less expensive.

The main thing is for the government to grasp the opportunity by publishing APIs for services, and the format (metadata) for the information contained there - we've seen the success of this in transport publication of timetable and live data and in the DVLA case where services for renting/buying/selling/taxing/mot cars are made much smoother for the end user and for traders too.

By de-coupling the services from the identity by allowing heterogeneity and diversity, we allow adoption and integration of silo-busting applications, based around the end-user/citizen.

Footnotes -

An example: such a digital service could have benefitted EU citizens who wished to remain in the UK, but were required to retrieve information from multiple places. Border force records of trips in/out of the UK were not available (shocking, given claims that border control increases national/travel security), but being able to show tax returns, employment status, and residence information was feasible for most people via (mostly) open APIs or, at worst, download of data and printing. So the aggregation of data from multiple government and NGO sources in the app is a compelling case for federation, not a single system.

Previous centralised, single system approaches to issuance of foundational Id have dismally failed in the UK, also in Nigeria (3 times each) - the main exception to this observation is, perhaps, the Indian Aadhaar (UIDAI) system, which already covers 1.2 billion citizens. However, this was in a country where a significant fraction of the population do not have smart devices. And the applications of the Indian Identity systems (functional Id) were not in place for some time. In the end, the most compelling has been for payment systems, but this would not be a priority at all in the UK, where most citizens already have (mobile) banking, and so it isn't an incentive for people in the UK to adopt any unified identity. Integration of applications that process personal data is much more persuasive.

Couple of caveats - we may want to implement a reliable key management system, but it should be citizen centric, and thus needs careful thought to deal with key recovery - Shamir secret sharing would work - one can split the key across multiple (state and private and social) circles, and only need, say, 3 out of 10 to answer to get a key back. Similarly, we can replicate copies of our (encrypted) data from the different shards (services) across other services, for high availability and recovery from outage/loss - but need to use this sharded key system to make those copies safe.
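A minimal sketch of that 3-out-of-10 split, using Shamir's scheme over a prime field. The function names and the choice of prime are illustrative assumptions, the key is treated as an integer smaller than the prime, and a real deployment would want a vetted library rather than this.

```python
import secrets

# 2**127 - 1 is a Mersenne prime; the secret must be an integer below it.
PRIME = 2**127 - 1

def split_secret(secret, threshold, shares):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    # Random polynomial of degree threshold - 1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, mod PRIME
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, shares + 1)]

def recover_secret(points):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Usage would be along the lines of `shares = split_secret(key, 3, 10)`, handing one share to each of ten circles, and later `recover_secret` on any three that answer.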

Tuesday, May 23, 2023

the delusion of the benefits of digital precision - from foundational identity to financial inclusion - ignores the root causes

I'm sitting in ID4Africa and hearing the rapid advances in deploying national digital identity across an amazing number of countries, and a lot of attempts to paint a rosy picture of the ethical, policy, legal, and societal considerations of the design of such systems.

but this seems to me to be putting the cart before the horse, or indeed, two carts before two horses.

people aren't excluded because they don't have identity - they are prevented from establishing a solid basis for id, because they are in a marginalised group.

people aren't financially marginalised because an organisation cannot do an affordable KYC check on them. They're not eligible for loans because they don't earn enough, or because they take legitimate exception to the notion of debt. and they don't want to store value in money, but might prefer collective ownership of resources (like common land or barns, or rights of way, clean rivers and air). Counter example - Universal Basic Income solves a lot of financial exclusion but doesn't require digital id.

the drive does seem to be somewhat driven by the OCD nature of governments once they get their hands on computers - instead of rejoicing in diversity, everything tends to reductionism (once again)

Do you need to have an id to have the right to be educated and informed (so that you can plant rice in the right place at the right time, for example)?

the reductionism is also I think coupled with the completely incorrect notion that if you assign some unique bit pattern to distinguish an entity from another entity, that you have more knowledge (and therefore maybe power) over that entity. As the prisoner (No. 6) said, "I am not a number, I am a free man".

Also heard someone claim that the acceleration towards global digital id was driven by the inclusivity achieved by its use during the Covid-19 pandemic - a claim made with a refreshing lack of the slightest bit of evidence.

Indeed, most of the national id systems are touted on the basis of also allowing fraud detection but note, in the UK at least, underclaiming of benefits massively outdistances fraud, and I'm guessing that's due to failure to be inclusive, whereas the fraudsters are likely sophisticated anyhow. So the goals are misaligned with the rhetoric.

Panglossian

Thursday, May 18, 2023

US and AI regulation - brief notes

why US tech bros are calling for gov regulation (or in at least one case, self regulation) - but why any regulation at all?

Firstly

1/ coz EU AI act

and as with DMA/DSA and GDPR,

will have impact (on US and even on China) -

as it has done with privacy.

n.b. UK also has a view on new regulation,

that is not that divergent from EU - a little lighter perhaps.

2/ specifically problems with training data -

hallucinations - render tools useless for safety or financial critical advice (health, banking etc)

consent - may have used personal data without...

privacy - could therefore constitute an invasion of privacy, esp.

model inversion attacks (we can extract training data from the AI!)

etc etc

3/ HuggingFace (llama etc) is free software that does most of what the GPT stuff does, but see also google re: leak doc: "we have no moat" and meta's data leak

This story recently reported in NY Times too.

The open source systems are also free & free of those problems/constraints on the data (it isn't copyright or private)

which really messes with Microsoft/OpenAI's (Google/Bard etc) business model/case.

Challenges

4/ Self reg benefits big tech

but worked badly with social media dealing with moderation/toxicity/political

interference - see proposed online harms bill in UK for example

5/ On the other hand even neutral, government or quasi

gov agencies are subject to regulatory capture :

c.f. FCC/FDA in US in comms and pharma etc

and Ofcom, ICO in UK in Telecom and data etc

positives

6/ However, the US does have one tech it largely made

and where regulation/governance is not bad at all -

that's the Internet

so not a simple story

variomatic queue management

back in the day, there was an infinitely variable gearbox made by DAF that could smoothly vary "gearing" between engine and wheels. imagine you could do this with the serialisation time for packets, effectively? One way to do it would be to actually change the egress port physical clock speed. Then if you had per-flow queues, legacy TCP traffic undergoing the classic saw tooth increase/decrease congestion window, could be given something that approximates to constant delay.

How could we do that in a practical way? How about network coding packets from different flows, but using Random Early Coding - this is based on the observation that flows in the higher rate part of their teeth will have more packets arriving, so need to be coded more with other packets from the flows in the lower part of the teeth.

Indeed, the very idea here is to move from having teeth (as in cogs) to having belts (as in variomatic)...
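The constant-delay part of this (before any network coding) is just arithmetic: queueing delay is backlog divided by service rate, so holding delay fixed means scaling the rate with the backlog. A rough sketch, where the function names and the min/max clamping are my assumptions about what a real port could do:

```python
def serialisation_time(packet_bytes, rate_bps):
    """Seconds needed to clock one packet onto the wire at rate_bps."""
    return packet_bytes * 8 / rate_bps

def egress_rate_for_constant_delay(backlog_bytes, target_delay_s,
                                   min_rate_bps, max_rate_bps):
    """Service rate that drains the current per-flow backlog in target_delay_s.

    The clamp stands in for whatever range of clock speeds the port supports.
    """
    rate = backlog_bytes * 8 / target_delay_s
    return max(min_rate_bps, min(max_rate_bps, rate))
```

As the saw tooth climbs and the backlog grows, the computed rate climbs with it, so the time a packet waits stays roughly at the target until the clamp bites.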

I feel a hotnets paper in the wings

Monday, May 15, 2023

decentralising the nation state

documenting citizenship can be difficult during interesting times, and generally, historically, at any given time, somewhere on earth, people are living through just that challenge.

in speculative fiction, (e.g. snow crash) we've seen the idea of decentralized nations show up quite often.

some states have also taken extreme measures to decentralize their digital citizen data (estonia) and even replicate/distribute it over servers in other nations for safe keeping (backup, in case of invasion).

however, why don't we take this a step further into the future and the past. everyone is the result of a diaspora. at some point, humans emerged from the savannah in east africa and dispersed over the planet in multiple waves, sometimes displacing earlier waves. everyone is part of some group of displaced people from the past. How about we build on this to recognize these groups as they happen, and take the hint that nation states are not places forever (or ever) but just the collective illusion of the current occupants of some approximate locus. we could pre-empt id problems by pre-fetching copies of digital citizenship (for example) - or "sending ahead" to care-of early "settlers" - we could have safe keeping of our personhood by people that we trust, and trust us.

many asylum seekers (e.g. in the UK, 70% of arrivals) choose to go to a particular place (not randomly) and are successful because they not only have a reason to leave some source (danger of death) but also have an affinity for the destination (family, place to live, work etc). So we should formalise this - I am sure there are informal protocols that people with the resources have (when worried about possible rogue state/pogroms etc, people sew their jewelry and money into their kids' clothes and send them ahead. people open bank accounts in safe places and send money there and copies of essential documentation, and so on).

Why not build a decentralised system that "just does this" - perhaps this could be based in the tech that mastodon and matrix have built (or indeed, their earlier precursor, aptly named Diaspora )...

It seems that there could be some legal barriers, and also clearly, many people would need to be told about possible rights to such a system & associated values... that of course would be where most of the challenges would lie, not in the technology, which we already pretty much know how to build, due to the examples above.

Wednesday, May 10, 2023

decentralized learning

Some early papers emerging are switching from federated learning (FL) with a dedicated model parameter aggregation server, to decentralized learning (DL) with P2P distribution of model parameters. Both suffer from n-1 message scale challenges: at each step of the machine learning training, model parameters are sent to update everyone else, then everyone moves to the next step of the iteration on their local data.

This has multiple downsides -

1. from a performance perspective, the implosion of messages creates a severe bottleneck - this can be alleviated through sparsification of the model parameter update cycle, either sending updates to a (say) random subset of other nodes (DL) or from a random set of nodes to the server (FL) or thinning the actual parameter list (both). A more complex distributed approach is to have a hierarchy of aggregation points - so in FL, this just means having clusters of multiple specialised servers, with subsets of FL clients, which then coordinate amongst themselves to forward partial aggregates to each other. This already smacks partly of DL, so why not just cluster a set of DL peers and elect a "supernode" (anyone remember the original Skype architecture, which did exactly this?) based on CPU & bandwidth, and have it act as cluster head/aggregation point, and then coordinate with other supernodes (possibly recursively) - this could also work wide area, and take account of local versus long haul bandwidth.

2. distributed iterative synchronous algorithms have the straggler problem - there are lots of straggler elimination schemes - and the clustering in point 1 will also help with this, but even within a cluster we should have mechanisms to timeout, or even shoot-down nodes that are too slow - but note, also, we can simply run asynchronous - many ML schemes are stochastic, and will work well if run asynchronously. Or we could run a mix of synchronous for similar speed nodes, and async updates for slower nodes (or nodes on slower links).

3. Nodes taking specialised roles may crash, so we need to run replicas. There are a ton of replication schemes (e.g. Raft) and again, we don't even need perfect consensus for training to converge, we just want to increase availability - indeed, one could build a virtual Clos topology out of supernodes for DL, and get redundant servers autoconfigured for free...
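To make points 1 and 2 a bit more concrete, here is a toy sketch of one clustered round: each cluster elects a supernode, the supernode averages whatever arrives before a deadline, and supernodes combine their partial averages. Everything here (the peer dictionaries, the CPU x bandwidth score, the fixed deadline, the message counting) is an assumption for illustration, not a description of any existing FL/DL system.

```python
def elect_supernode(cluster):
    """Pick the cluster head by a CPU x bandwidth score (scoring rule assumed)."""
    return max(cluster, key=lambda p: p["cpu"] * p["bandwidth"])

def mean(vectors):
    """Element-wise mean of equal-length parameter vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cluster_aggregate(cluster, deadline):
    """Supernode averages the updates that arrive before the deadline.

    Returns (head name, on-time count, partial average, straggler names).
    """
    head = elect_supernode(cluster)
    on_time = [p for p in cluster if p["latency"] <= deadline]
    stragglers = [p["name"] for p in cluster if p["latency"] > deadline]
    partial = mean([p["params"] for p in on_time]) if on_time else None
    return head["name"], len(on_time), partial, stragglers

def global_aggregate(partials):
    """Supernodes combine (count, partial average) pairs into a weighted mean."""
    partials = [(n, vec) for n, vec in partials if n > 0]
    total = sum(n for n, _ in partials)
    dim = len(partials[0][1])
    return [sum(n * vec[k] for n, vec in partials) / total for k in range(dim)]

def messages_flat(n):
    """All-to-all exchange: every peer sends to the other n - 1 peers."""
    return n * (n - 1)

def messages_clustered(cluster_sizes):
    """Peers -> head, heads all-to-all, then head -> peers."""
    k = len(cluster_sizes)
    return sum(2 * (size - 1) for size in cluster_sizes) + k * (k - 1)
```

For 20 peers, `messages_flat(20)` is 380 per step, while `messages_clustered([5, 5, 5, 5])` is 44. The stragglers returned by `cluster_aggregate` are the candidates for timeout, shoot-down, or asynchronous updates.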

side note - supernodes also deal with NAT traversal, so nodes learning in homes (typically behind NAT and firewall) can still find other nodes. If we don't care for fixed supernodes, we can also use BitTorrent-like schemes for dynamically switching, based on load, between a small set of supernodes, and, indeed, use BitTorrent's decentralised tracker (e.g. Kademlia) for discovery too.
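For the discovery piece, the core of Kademlia is small: node ids are 160-bit numbers and "closeness" is the XOR of two ids read as an integer. A sketch of just that metric follows; deriving an id by hashing a peer name with SHA-1 is my assumption for illustration, and the routing table, buckets and RPCs are all left out.

```python
import hashlib

def node_id(name):
    """160-bit id from a peer name (hashing a name is an assumption here)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def xor_distance(a, b):
    """Kademlia distance between two ids: bitwise XOR, read as an integer."""
    return a ^ b

def closest_nodes(target, known_ids, k=3):
    """The k known ids closest to `target` under the XOR metric."""
    return sorted(known_ids, key=lambda n: xor_distance(n, target))[:k]
```

A learner looking for a supernode would hash whatever it is looking for into a target id, then repeatedly ask the closest nodes it knows about for nodes that are closer still.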

We have all the pieces. We need to build it, and they will come.

Thursday, May 04, 2023

moses' internet

there shall be no functionality that comes between people

believe what you hear, but not what you say

the values of the internet are above rubies, but the value of the internet is beneath swine

thou shalt not worship connections - stay loose, and admit nothing.

thou shalt not erect bogus names, and shalt only address the one true destiny

multi-anything is an abomination, there is but one true means of finding the one true way

corruption is everywhere. be constantly aware

do not covet thy neighbour's paths, they are not the true paths.

goats may not use the internet

nor shall it be easier for a rich man to use ssh, than for a camel to use telnet

the laws are written in cli, which is subject to change. make sure you have read the T&C and are familiar with all their details. If in doubt, consult one of our affordable seven lawyers.
