Tuesday, November 09, 2021

10 reasons I'm not giving you my data


1 there isn't any (I made it up)

2 I don't understand the tools I use so I have no idea how they got from data to cool graphs

3 I don't have time/money/people to work on making data usable by other people, so I'll just keep it for the next 19 papers we can dice and slice from it

4 you'll find mistakes in my paper

5 you'll find mistakes in my data

6 you'll uncover massive privacy invasion

7 you'll uncover a massive breach of ethics

8 you won't give me your data anyhow

9 I lost the data

10 the policy says I won't get any kudos/promotion if I share data

11 I can't count

Note all these apply to code as well. Nota bene, all these apply to the paper in the first place; it is just more work for your adversary (competitor) to re-do the code and data as well as the paper. So why do people share papers? If mechanisms (incentives) work for that, then the same mechanisms should work for data and code, no?

Monday, September 13, 2021

a short tale of a weary traveller

yesterday, three of us experienced the hassles of having to produce the 5 documents & dependencies on infrastructures required to travel/fly from Crete to London during the period it is held in amber status by the UK, due to covid cases (Sep 2021)

1/ passport, on paper (possibly e-passport - needed to get through LHR in less than a day:-)

2/ vaccination status (2 QR codes) - on smart phone; may need net access to download, if you forgot to do so ahead of time (or it was deleted)

3/ covid test result from within the previous 72 hours (scan or paper) - typically given on paper, so you need a camera phone or scanner

4/ passenger locator form (PLF), including reference for day 2 PCR test booking after return (so you probably needed email to get that booking). BA required these to all be uploaded before checkin & issuing document number

5/ boarding pass (on smartphone or paper), needing access to printer, scanner or camera phone & network - I guess they might have accepted them all on paper at the checkin desk...

2 days after return, get PCR test and a day later get results (negative, unsurprisingly). However, a day after that, the three of us that travelled together all get pinged by the NHS Test&Trace to say we have been in contact with someone and now have to get another test - this is stupid, because we had not been together since landing back in the UK, so the only point of contact must have been another traveller on the plane. But we have all had a negative PCR since then, and the NHS know this, so why ask us for another test? A failure of joined-up thinking. (I'd say it was a race condition, but I've never heard of one that lasted over 24 hours).

The only way we were on the NHS records was via the passenger locator form (since they used both mobile & email to contact the three of us and that's the only place they would get that info) which has the booking info for the day 2 returnee PCR test on it. 

Wednesday, August 25, 2021

on trusting trust and the shadows on the wall of the cave

Reading the excellent Your Computer Is On Fire recently, I found a great chapter revisiting Ken Thompson's rightly famous Turing Award Speech about trusting trust. The chapter also discusses the Wheeler solution to the problem --

in a nutshell, when you use a tool chain for building a computing system, you depend on the tool builders. So an application must be compiled (or interpreted) and runs on an operating system, which runs on hardware which may be networked and so on - it is "turtles all the way down". The Thompson "hack" takes advantage of two things - bootstrapping compilers and quotation - to build systems that build in trapdoors at build time, but in a way that is not visible to simple inspection of the compiler tools (without going back in time to before the hack and before the bootstrap - i.e. it introduces a cost of effectively rebuilding your tools ab initio every time to avoid the trapdoor re-insertion two step dance). The Wheeler solution is to find some tools from elsewhere as well, compile your system with those too, and compare the results. An alternative is to use trustworthy computing so that the privileges don't increase as you go down the stack, and you can check the integrity of the tools & die as well as apps - but now with attestation, or with multiple toolchains, we have a chain or even a web of trust rather than a stack of trust. We may need a web (or even a blockchain) because we want to mitigate collusion (between key signing agents or between different tool builders, or, obviously both).
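The "compare the results" step can be sketched in a few lines. This is only a minimal illustration of the comparison, assuming both toolchains produce deterministic output; the function names and the file names in the usage note are hypothetical, and producing the two builds is left out entirely.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def builds_match(build_a: Path, build_b: Path) -> bool:
    """True only if the two build outputs are bit-for-bit identical."""
    return sha256_of(build_a) == sha256_of(build_b)
```

Usage would be something like builds_match(Path("out-toolchain-a/app"), Path("out-toolchain-b/app")). A mismatch doesn't prove a trapdoor (the builds may simply not be reproducible), but a match means both toolchains would have had to insert the same one.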

Isn't life complicated...?

Tuesday, August 10, 2021

decentralisation & disintermediation

thinking about the history of peer-to-peer (IP routers, eDonkey, the original Skype, and now new things like Matrix, Mastodon etc) - there are several properties oft conflated together

1. distributed - in your pocket, kitchen, on your bike, etc

2. decentralised - there's no agency or service with a single point of contact, failure, power, etc

3. heterogeneous - and partially federated - implemented by different people, but interoperating 

what this also means is that there's no big intermediary - no single platform owner who has a god's eye view of the proceedings, marketing things or surveilling things. There could still be such services, but they would need cooperation from all the targets they'd want to hit or spy on.

what is wrong with Uber, Airbnb and (probably) Bitcoin is that while they have some of these properties, they are dependent on single large infrastructures (roads/gas, houses/keys and the electricity grid) - you can build a fully peer-to-peer map of the world and let everyone share their EVs, and you could move all property into collective ownership (gasp), and you can build a decentralised trust system that doesn't depend on proof-of-work, but without that, these systems are fundamentally intermediated by those key infrastructure owners, who could change the operating rules to make what is done infeasible, or just pwn it. i.e. their governance is extremely sketchy.

Thursday, July 15, 2021

The internet is made of holes

This Atlantic article by Zittrain suggests that the internet is decaying. I think this is a classic observation error - the internet is like a kid's plastic inflatable garden pool that is being blown up bigger and bigger and filled with water the whole time to overflowing - sure, lots of spillage, but also more and more content. and this isn't just a quantitative observation - more and more of the content is curated in various ways. the problem is that exponential growth brings more quality content, but even more (in just pure numbers of, say, pages or photos or ditties) junk in the long tail, which isn't being looked after (think of all the social media content that disappears when people grow up and delete their (last year's most popular service) account).

sure, the internet is full of holes. that is why the content was organised as a web  - the clue is in the name:-)

If "important" stuff is disappearing permanently, often, I think someone would do something about it, and they are...

Monday, May 24, 2021

Photo Id for Voting in the UK

There are about 3.5M people of voting age in the UK who don't have photo id.

Many cannot afford a passport and don't drive so won't get a driving licence.

so government proposals to require photo id for voting are

a) unfair on them as the hassle of getting some other voter id will deter some from voting, and is clearly motivated by which segment of society they are likely from, politically.

b) Plan B is to have local councils generate free, or very cheap, photo id for those people to get on demand. Not a great plan, since such Id will then become a target for fake id (as it is in the USA).

This will increase voter fraud (which currently in the UK runs at about 1 case per election), at a cost of about £20M per annum. brilliantly counter productive.

but such id (as experience in NI and the aforesaid USA shows) will also be used for age verification, and even ID checks for people making payments, hence increasing fraud there, massively.

Ironically, that is something the Online Harms bill should really be addressing - another piece of pointless government legislation, just to be seen to be doing "something" for a problem that exists in another country, but not here. doh.

what's in an NHS App QR code that vouches for your vaccine status?


If you've got the NHS app (the one you use for booking appointments, or repeat prescriptions, not the contact tracer one), you can download a vaccine/covid status to it - here's mine, decoded

on it, you see my name & dob and the vaccine dose name, batch number and date, plus it is signed, and can be checked for its legitimacy - there are international protocols (at least for the EU, and the UK is still cooperating on that). If you don't have a phone capable of running the app, you can get a letter from your GP (takes a few days) - not too much data being given away here - you don't need to show the vaccine status being downloaded, you can store it (or get it emailed) and a border person could check it with (presumably) some other app and check name/dob against passport.

the code is valid for 1 month - i.e. it expires, so you then just download (or get emailed) a new one - so long as the vaccine wasn't so long ago that its efficacy has dimmed (and we don't know how long that is yet for all the vaccines in use) you should just get a new valid QR code or cert (or letter) for another month...

not a lot of privacy threat here....nor is it a huge burden on systems to run something like this...

ref: https://paravirtualization.blogspot.com/2021/05/whats-in-nhs-app-qr-code-that-vouches.html

trust framework: https://ec.europa.eu/health/sites/default/files/ehealth/docs/trust-framework_interoperability_certificates


<COSE_Sign1: [{'Algorithm': 'Es256', 'KID': b'Key5PRO'}, {}, b'\xa4\x01bGB' ... (350 B), b'\xd1zo\xb3\x1b' ... (64 B)]>
{
    "-260": {
        "1": {
            "dob": "xxxxxxxxxx",
            "nam": {
                "fn": "Crowcroft",
                "fnt": "CROWCROFT",
                "gn": "Jonathan",
                "gnt": "JONATHAN"
            },
            "v": [
                {
                    "ci": "",
                    "co": "GB",
                    "dn": "1",
                    "dt": "2021-02-11",
                    "is": "NHS Digital",
                    "lot": "EL7834",
                    "ma": "ORG-100030215",
                    "mp": "EU/1/20/1528",
                    "sd": "2",
                    "tg": "840539006",
                    "vp": "1119349007"
                },
                {
                    "ci": "",
                    "co": "GB",
                    "dn": "2",
                    "dt": "2021-04-09",
                    "is": "NHS Digital",
                    "lot": "ER1749",
                    "ma": "ORG-100030215",
                    "mp": "EU/1/20/1528",
                    "sd": "2",
                    "tg": "840539006",
                    "vp": "1119349007"
                }
            ],
            "ver": "1.0.0"
        }
    },
    "1": "GB",
    "4": 1624147200,
    "6": 1621341834
}


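import json
import zlib

import cbor2
from base45 import b45decode
# Import path as in the original post; if your install provides the
# module as "pycose" instead, adjust this line accordingly.
from cose.messages import CoseMessage

# Paste the text content of the QR code (it normally starts with "HC1:")
qr = input("QR plz: ")

# Strip the "HC1" prefix and the optional ':' that follows it
if qr.startswith('HC1'):
    qr = qr[3:]
    if qr.startswith(':'):
        qr = qr[1:]

compressed = b45decode(qr)                # base45 text -> zlib-compressed bytes
cose_bytes = zlib.decompress(compressed)  # -> COSE_Sign1 message bytes
msg = CoseMessage.decode(cose_bytes)      # parse the envelope (signature NOT verified here)
claims = cbor2.loads(msg.payload)         # CBOR payload -> Python dict

print(msg)
print(json.dumps(claims, indent=4, sort_keys=True))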
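The one-month validity can be read straight off the decoded payload shown earlier: in the CBOR Web Token layout, claim 4 is the expiry time and claim 6 the issued-at time, both in Unix seconds. A small sketch, using the values from my payload; the is_current helper is just an illustrative name, not part of any NHS or EU tooling.

```python
from datetime import datetime, timezone

# Values copied from the decoded payload: claim 4 = expiry,
# claim 6 = issued-at, both in seconds since the Unix epoch.
claims = {1: "GB", 4: 1624147200, 6: 1621341834}

issued = datetime.fromtimestamp(claims[6], tz=timezone.utc)
expires = datetime.fromtimestamp(claims[4], tz=timezone.utc)
lifetime = expires - issued


def is_current(claims: dict, now: datetime) -> bool:
    """True if `now` falls inside the issued-at .. expiry window."""
    return claims[6] <= now.timestamp() <= claims[4]


print(issued.isoformat())   # 2021-05-18T12:43:54+00:00
print(expires.isoformat())  # 2021-06-20T00:00:00+00:00
print(lifetime.days)        # 32
```

So "1 month" is really 32 and a bit days here, rounded to a midnight UTC expiry.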



reminder of value of contact tracing:-


but also of risks:-


Friday, May 14, 2021

Proof of Green

so rather than burn the earth even faster in some bogus pursuit of decentralized crypto-currencies (we only have one earth, so bitcoin is inherently centralised around that one fact), why not use renewable resources to generate coins. I don't mean greenwashing where you place your mints next to hydroelectric or geothermal sources. I mean literally use the fact that sources like solar are highly time & space varying - a large solar array could be used to generate signatures: each cell will receive slightly different amounts of sunlight over time, so the voltage generated from each varies. This can be logged (e.g. on a blockchain) with GPS coordinates (now feasible down to centimeter accuracy courtesy of new devices), and acts as a unique coin value. This can be measured and verified by other parties. It costs almost nothing to mint, and is a side effect of building more renewable (solar) energy sources, rather than a pointless consumer of them.
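To make the logging step a bit more concrete, here is a very rough sketch of turning one reading into a value. Everything in it is an assumption for illustration: the record fields, the use of millivolts, and SHA-256 over canonical JSON are my own placeholders, and it says nothing about how other parties would actually verify the measurement.

```python
import hashlib
import json


def coin_value(voltages_mv, lat, lon, unix_time):
    """Hash one logged reading into a hex string (a hypothetical 'coin value')."""
    record = {
        "voltages_mv": list(voltages_mv),  # one reading per cell
        "lat": lat,
        "lon": lon,
        "t": unix_time,
    }
    # Canonical JSON (sorted keys, fixed separators) so that anyone
    # re-hashing the same logged record gets the same digest.
    blob = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()
```

Two arrays in different places, or the same array a moment later, give different records and so different values.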

see the light! 

Friday, April 30, 2021

mutable biometric auth - the really useful MBA

so here's a thought.

we now have reliable and safe mRNA for people. 

how about we use mRNA to teach our cells how to generate protein keys (key pairs) for crypto. We then have chip based readers that can check to see who we are (and we can build secure protocols for doing this that avoid obvious replay attacks etc), but without committing to using your actual DNA (or other biometrics) which, once compromised, cannot be changed.

chips that decode proteins are around - all the pieces are there. 

also, you don't get locked in to one provider (there are lots of people doing mRNA stuff, and we could even open source the mRNA system)....

seems like the way to go - 

before anyone tries to patent it:-)

Wednesday, March 24, 2021

Why not look at Augmented Human Intelligence, ahead of Artificial General Intelligence?

As part of the Turing's AI UK Conference I was thinking about where we should be in 5, 10, 30 years

I'd like to see if we can reverse Frank Zappa's observation that scientists are wrong to believe hydrogen is the most abundant substance in the Universe, and that it is, rather, far exceeded by human stupidity.

Given people's blatant lack of discernment in social media, voting, and generally outrageously dumb collective behaviour, e.g. in the face of existential threats like climate and nuclear weapons, this seems like an urgent matter, and building AI to mimic humans seems, at this point, like a bit of a seriously losing proposition.

So how could we use AI to augment human intelligence? The trick is not to democratise the writing of black-box AI (giving people visual programming languages for convolutional neural networks is an even worse idea than increasing the world's population of buggy C and Python coders).

The idea is to make existing work on AI legible. Not just explainable, but teachable. So when making a decision, an augmented human might use an AI method, and at the end, not just know what it recommended, and not only why, but how to internalise the knowledge and skill to use that method herself.

This is akin to the idea of the mentat characters in the novel, Dune. Humans carry out computational tasks, and computers have long since been banned after the fictional Butlerian Jihad, on the basis that they are unethical. In my view, that is somewhat of a limited view - we need to retain the AIs, but they become mentors.

To this end, we need to concentrate on AI tools and techniques that are intelligible, not just explainable. So while simple ML tools like regression and random forests are ok, you also need tools like generalised PCA and probabilistic programming systems, and Bayesian inferencing that clarifies confounders, and, if we must go on using neural nets, at least SHAP, path-specific counterfactual reasoning and energy landscapes, to illustrate the reason for the relationship between inputs and outputs. GANs fit here fine too. Ultimately all these systems should really be a pair - a model that is self-explanatory (e.g. physics, engineering, biological cause/effect), coupled with the statistical system that embeds the empirical validation of that model, and, possibly, a hybrid of symbolic execution and data-driven systems. Of course, people in guru/hacker mode writing the next gen AI need to document their processes, including their values, as this is all part of making the results teachable/legible/learnable too.

In the end, these systems will also likely be vastly more efficient (green cred), but also, intellectually, will contribute to human knowledge by exporting the generalisable models they uncover and make more precise, and allow humans to individually, and collectively, stop behaving like a bunch of eejits.

Then we can let the AIs all wither away, as we won't need them any more.

Tuesday, March 09, 2021

The Genies that probably won't go back in the Bottle

One discovery made about people in organisations using video conferencing was in the early days of the Defense Simulation Internet - this was about 30 years back (DSINet started around '91) and made extensive use of the Mbone technology to provide many-to-many real time video, audio and shared applications. One of the UIs for this had a prototype of the "hollywood squares" that many Zoom users will nowadays be familiar with.

Most of the real users of this system were wargaming (the shared apps included highly detailed battlefield maps with animations of army vehicles etc). At some point, the generals got really upset because they noticed the rank-and-file were talking directly to each other, rather than up-and-down the chains of command. Students of history will know that such a peer-to-peer organisation was also how the anarchist brigades operated in the Spanish civil war - it is highly effective as it is highly resilient (there's no leader to decapitate, and it is lower latency to get information to the people who need it to make decisions and take action).

This all applies to any overly hierarchical organisation, be it university, company or indeed, entire nation states. We cut out those annoying pointless "leaders" who make the wrong decisions because they are a bottleneck, and swamped with either too much advice, or too many filters, or too many lobbyists distorting the information. The Internet may finally actually democratise society, but not as previously envisaged.

By the same lockdown token, people have more time to consider content delivered by digital communication. Consideration may lead to more nuanced decision making (e.g. not responding to clickbait, or believing fake news, or even taking care to remember who was responsible for these things and mentally marking their future utterances as suspect, or at least "to be fact checked carefully when I have time after this").

Evidence for the increasing discernment by the broad public can also be seen in the search for relatively subtle explanations of what is happening (rules for lockdown, vaccine safety etc) - where people would once dismiss experts, they now choose an expert who explains about exponential increases in cases when R0 is above one, or the nature of false positives and false negatives in different tests. This is because after a year of hearing experts and politicians, it is increasingly obvious whose explanations and predictions are based in some sort of discipline, and whose are just self-serving attempts to maintain a wobbly power base.

You can fool some of the people some of the time, but 12 months in, everyone starts to realise who the real fools are. Or indeed, crooks.

About Me

misery me, there is a floccipaucinihilipilification (*) of chronsynclastic infundibuli in these parts and I must therefore refer you to frank zappa instead, and go home