Tuesday, May 28, 2019

the hype of incomprehensibility

I've been looking at various techs for a few years now and watching their lifecycles. It doesn't always involve hype: sometimes things just seep into everyday life (the internet kind of did this over a couple of decades, and even mobile phones kind of did). So looking at the things that don't make it, or that have to go through some massive transformation to stand any kind of chance, one of the tells is that the tech is very badly explained, often hidden behind simplistic banner phrases like "blockchain" or "quantum computing" or "deep learning". When you look at the swathes and tranches of literature, what is striking is the lack of straightforward examples.

Sometimes this can be simply because the tech is actually rather subtle and might involve understanding several other things first (quantum computing seems to fit in this category; Bayesian methods like MCMC might be another). Other times, it is that the smart people who make it their business to explain important new stuff in really straight-speaking ways (e.g. The Morning Paper) stick to stuff that is worth explaining.

So if you see a huge pile of grey literature about something, and there isn't anything on one of the classier blogs or an op-ed in a leading place, be suspicious (e.g. cold fusion, brexit, DLT, etc).

Friday, May 24, 2019

data is the new snake oil

we hear a lot of hot air about data being the new oil - implying there's a rush of innovation and profits, as with a gold rush (there's money in them there data hills etc)

this is so badly broken a metaphor that we need to unpick (deconstruct) it further

1. data is free to copy (nearly), i.e. data is in some sense renewable, while oil gets used up (it's nearly 50% gone now).
2. using oil does as much harm (or possibly more) as good
3. using data can do harm or good
4. AI/ML is compute intensive - deep learning in particular is massively inefficient, and data centers (like power stations, in close proximity to which they are sometimes built) burn percentages of globally generated electricity - not always from renewable energy
5. data can increase in value as you have more of it, up to some point (sampling more about a population of people or things)
6. privacy could be modelled as efficiency (what's relevant/pertinent and what is none-of-your-business) in space and time (why do you still want to know that out-of-date thing about me or about that?).
7. much personal data collected by cloud providers is treated as if free, though some lawyers now are starting to point out that if you have a business model based on this, it is possibly a form of payment - so while facebook/zuckerberg might claim we are the product, if this legal position is true, we are customers, and he's working for us....
8. this mission creep really implies data could be the new fur (or indeed as john naughton has said, the new tobacco)
9. the models (e.g. face recognition, recommender networks etc) are often surprisingly bad - successes of GANs & deep learning are relatively rare compared with a plethora of rather shoddy systems & applications.
10. perhaps data is the new oil after all, but it's rapeseed or snake oil that would be the more precise metaphor.

Wednesday, May 15, 2019

Winnie the Who & other short short stories


Winnie the Who travels around in her Time&Relative Dimensions in Pooh potty, having potty adventures with all and sundry, frequently combatting her nemesis, the Young Master Robin.

In the not too distant future, it is discovered that privacy is actually a fluid, called efluvium, that can be secreted by a genetically engineered gland. It turns out that the efluvium blockchain wasn't immutable at all, after all.

Meanwhile, AIs with untaxed imagination will from now on be clamped.

As the story wore on, he realized that time was running out, and soon, so was he

Tuesday, May 14, 2019

life cycle

this great talk at the Royal Society by Professor Mark Jackson riffed on The Fall and Rise of Reginald Perrin as exemplifying the speaker's hypothesis that the mid-life crisis is more socially determined than biological - midlife of course refers to the period between adolescence and senescence, which are also, to some extent, movable feasts.

I asked him afterwards how valid it was to apply the same notions to institutions (is the Royal Society middle aged? is democracy senile? etc etc), but also whether he could look at the 3 Rises of Leonard Rossiter, the genius actor who played Perrin so perfectly - and was previously in Rising Damp, perhaps an adolescent drama, and in the very mature Resistible Rise of Arturo Ui, a Brecht piece which could very well be re-run as a riff on Nigel Farage, right now.

Thursday, May 09, 2019

digital twinge

there's a trending meme concerning the use of digital twins, which is reminiscent of the old DARPA Total Information Awareness fantasy - in that world, every physical thing (and being) is fully instrumented and telemetry is sent/gathered by the cyber-panopticon (quite likely in the sky, like some early James Bond villain).

The problem with the vision is that it gains you nothing and costs you the earth (almost literally in terms of resource use, e.g. energy in communications, storage and computation).

The digital twin metaphor replaces (or clones) the physical world with a digital replica.
Aside from the basic resource costs, there are also interesting challenges, such as making sure the digital replica actually is a clone and hasn't drifted out of sync with its physical sister - much as Hari Seldon's recorded holographic appearances did in the second book of Asimov's great original Foundation trilogy, when his much vaunted psychohistorical prediction of the future had diverged from reality due to the intervention of the Mule, a random mutation. And exactly as the Dr Who DVD blipverts (easter eggs) don't diverge at all in the fabulous Blink episode.

Aside from that, the reason it is pointless (as well as infeasible) is that it doesn't do what anyone wants - so you have a digital copy of everything - it doesn't abstract one whit - it doesn't let you introspect, it isn't interpretable or intelligible. One could subject it to some massive scale Bayesian model inference, I suppose - but why? We have science - we have models of how physical (and social and biological) stuff works - we need only record instances and divergence from the model.
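
To make that last point concrete, here is a minimal sketch of what "record only divergence from the model" could look like. Everything in it is made up for illustration: the `divergences` helper, the `room_model` function, the tolerance and the readings are all hypothetical, and a real system would need a far better model than a straight line.

```python
from typing import Callable, Iterable, Iterator, Tuple


def divergences(
    readings: Iterable[Tuple[float, float]],
    model: Callable[[float], float],
    tolerance: float,
) -> Iterator[Tuple[float, float, float]]:
    """Yield (time, observed, predicted) only where reality departs from the model."""
    for t, observed in readings:
        predicted = model(t)
        if abs(observed - predicted) > tolerance:
            yield (t, observed, predicted)


# Hypothetical model: a room that starts at 18 degrees and warms by 0.5 per hour.
def room_model(hour: float) -> float:
    return 18.0 + 0.5 * hour


# Hypothetical (hour, temperature) readings; only hour 3 is far from the model.
readings = [(0, 18.1), (1, 18.4), (2, 19.0), (3, 23.0), (4, 20.1)]

to_record = list(divergences(readings, room_model, tolerance=0.5))
print(to_record)  # [(3, 23.0, 19.5)]
```

Five readings come in and one gets stored; the rest can be regenerated from the model whenever anyone actually wants them.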

Friday, May 03, 2019

ethics theatre, but what kind of theatre?

so the internet abounds with people talking about ethics of ai. some organisations are accused of ethics washing, and others of ethics theatre. so what kind of theatre? is it soap opera (consistent with washing) or is it theatre of the absurd? is it tragic or comedic? is it a niche genre (science fiction ethics is quite a common trope) or is it a broadly popular form (romance, fantasy) or a deep, foundational literary form (tolstoyesque or janite, perhaps)?

perhaps it's a fan-contributed literature (like those popular Dr Who and Star Trek episodes, or the posthumous novels starring James Bond or Philip Marlowe)?

I think we need a TV series to examine this meta-question.

Thursday, May 02, 2019

redecentralized data centers

what makes us think a data center at the center is more efficient than a fully decentralized cloud of personal clouds?

partly it's just google, amazon, microsoft et al selling us services because of their initial requirement (a place to run pagerank, or a place to run billing and fulfilment for warehouses/delivery, or a place to run xbox arena stuff). having spent a lot of money building a big place to (e.g.) pull all the web pages to and build an index so that search could run fast, and then having had to deal with node/link failures in the data center by replicating and adding redundancy and consensus algorithms and so on, you end up with a fairly expensive resource which is idle a lot of the time. so you start to think about leasing off time on it for other stuff (services and customers/tenants etc) - hence cloud stuff emerges. really the only cost saving is that you have a bunch of smart sys-admin dev-ops lying around idle 23/24 of the day, so amortize their cost over some other customers :-)

given you can't just greedily walk the web at max speed, and in any case web sites change rarely, the pagerank spider/robot can adjust its rate to match the expected next change - but this means you have plenty of spare cycles in any data center scanning sites where no-one is awake - or where no-one is still up playing games, or ordering stuff. so what do you do? you virtualize your compute and net, and elastically (statistically) multiplex it (initially prioritizing your primary business, search, games, sales, but later dedicating new resources for these uses, and then priority pricing some to give them a "premium experience" etc etc)
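
as a rough illustration of a spider matching its rate to how often a site changes, here is a toy revisit scheduler. it is only a sketch under my own assumptions (a simple double-or-halve rule, with made-up bounds of one hour and thirty days); `next_interval` is a hypothetical helper and not how any real search engine schedules its crawl.

```python
def next_interval(
    current_hours: float,
    changed: bool,
    min_hours: float = 1.0,
    max_hours: float = 24.0 * 30,
) -> float:
    """Back off when a page was unchanged, speed up when it had changed."""
    if changed:
        return max(min_hours, current_hours / 2)
    return min(max_hours, current_hours * 2)


# Hypothetical crawl history for one page: did it change since the last visit?
history = [False, False, False, True, False]

interval = 24.0  # assumed starting point: revisit once a day
schedule = []
for changed in history:
    interval = next_interval(interval, changed)
    schedule.append(interval)

print(schedule)  # [48.0, 96.0, 192.0, 96.0, 192.0]
```

a page that rarely changes quickly drifts out to being visited every week or so, which is exactly where the spare cycles come from.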

start from a different starting place, and why bother? all that tech for availability would work fine in the wide area: leave the data where it was and just run your algorithms (as they run on multiple nodes in the data center) on all the home machines - no data center, no waste of energy and bandwidth moving all that stuff to and fro all the time.

so could you then do search etc? of course you could. you'd gossip the index instead of the pages, you'd run p2p games, and you'd have virtual high street shops everywhere
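
a minimal sketch of what gossiping the index (rather than the pages) might look like: each peer builds an inverted index over its own pages, mapping a term to the peers that hold it, and then swaps indexes with a randomly chosen partner until everyone agrees. the peer names, the pages and the helper functions are all hypothetical, and a real system would have to worry about churn, index size and trust, none of which appear here.

```python
import random
from collections import defaultdict
from typing import Dict, List, Set

Index = Dict[str, Set[str]]  # term -> names of peers holding a matching page


def local_index(peer: str, pages: List[str]) -> Index:
    """Index a peer's own pages; the pages themselves never leave home."""
    index: Index = defaultdict(set)
    for page in pages:
        for term in page.lower().split():
            index[term].add(peer)
    return index


def merge(into: Index, other: Index) -> None:
    """Fold another peer's index entries into ours."""
    for term, holders in other.items():
        into[term] |= holders


def gossip_round(indexes: Dict[str, Index], rng: random.Random) -> None:
    """Each peer does a push-pull index exchange with one random partner."""
    peers = list(indexes)
    for peer in peers:
        partner = rng.choice([p for p in peers if p != peer])
        merge(indexes[peer], indexes[partner])
        merge(indexes[partner], indexes[peer])


def converged(indexes: Dict[str, Index]) -> bool:
    views = [dict(ix) for ix in indexes.values()]
    return all(view == views[0] for view in views)


# Hypothetical home machines and the pages they hold.
pages = {
    "alice": ["cat videos", "tea recipes"],
    "bob": ["tea ceremony"],
    "carol": ["quantum cat"],
}
indexes = {peer: local_index(peer, docs) for peer, docs in pages.items()}

rng = random.Random(0)
rounds = 0
while not converged(indexes) and rounds < 100:
    gossip_round(indexes, rng)
    rounds += 1

# Any peer can now answer "who has pages about tea?" without a data center.
print(sorted(indexes["carol"]["tea"]))
```

only the term-to-peer map moves around the network; a query then goes straight to the machines that hold the matching pages.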

this is not a new idea: xenoservers, 20 years ago, were at least partly envisaged as enrolling home machine spare cycles for this very reason, not for virtualising nodes in an expensive, energy hungry data center built for profit...
