Thursday, March 30, 2023

moratorium bibliotecha

 It is time for writers and publishers to pause in their ceaseless efforts to produce more and more books.

The Large Library Machine has reached 14M texts, each of which contains up to 100,000 words, all usually carefully structured in such a way as to convey many truths, much wisdom, but also so many ideas that are just plain wrong. But how is the poor reader supposed to separate the wheat from the chaff? Even the index would take a lifetime to read, and that too has gaps and repetitions.

There are so many of these books that the business of organising them has taken over from tasks like farming, tailoring, doctoring, house building and other essential tasks, such that the population is going unfed, in rags, and constantly sniffling, from the constant breezes blowing through the gaps in the walls.

But no-one can find the right instructions for when to sew, when to plant, when to harvest, what could be used to treat the common cold, or re-plaster the walls.


No, we must call a temporary end to the constant creation of more and more written material, until we have had time to have a fair and open debate on how to accommodate this scourge without the damage to the social fabric that it has caused.


We need metrics. For example, we have no idea how to tell if a particular book will fit in a particular person's head. Nor do we know what size of library would overwhelm any person in the three score years and ten of their natural lifetime, or what would be a delicate sufficiency (assuming occasional re-reading of timeless classics) for most of us. Time to temporarily suspend the generation of new tomes. No more novels for, shall we say, six months. After all, we know that there are only 9 plots and 3 kinds of hero, with a handful of possible counterfactual realities, and merely 4 theme tunes in the televised adaptation.

Indeed, we could probably use the time to thin the libraries down to those works that have proved their worth. The rest could be used to stuff the gaps in the walls, or to weave winter warming underwear, or even to blow one's nose.


databox 2.0 - the manifesto

 databox and hat were two personal data store projects (sometimes called "edge cloud", sometimes "self sovereign data") that were trying to solve the practical and business challenges perceived nearly a decade ago, challenges also recognized in the web community in their Solid architecture.

while all these initiatives were somewhat successful in their own terms, what they've not managed to do is actually create a platform that others (e.g. mastodon or matrix) could "just" use.

Why not? What's missing?

Manageability. Compared with the cloud (centralised?), edge systems lack some properties which are really non-optional for storage and compute services today (30th March 2023)...


The list below is a start on what data centers and cloud computing have achieved, but which edge platforms have not really delivered.

Centralised systems provide these properties using techniques that could be re-applied at the edge, but one of the key missing pieces is what makes the cloud "scale out": amortising the cost of providing these mechanisms over many, many users. In the edge case you have precisely the opposite: the cost of participation increases as you add users, instead of (sorry :-) exploiting Metcalfe's law of net scaling, under which the value of the net increases super-linearly with the addition of each "user", since each one can also offer a "service" - i.e. super-additive value, rather than just an added burden on the overall community.

Let's be specific - this isn't about content, this is about compute and storage demand. Of course, a decentralised or federated system could still exploit users' content, although that is counter-indicated by most people's motives for using such systems. But the fabric has to provide:

  • availability - access networks and edge devices fail. so do core and metro links, and system units and racks in data centers, but much less frequently. To mask failures typically involves adding redundancy - e.g. multihoming systems, and running replicas. Replicas need to be synchronised, typically via some consensus scheme (or CRDT) - in the decentralised case, this is harder because failures are more complex and number of replicas may need to be higher (have seen estimates that you need about 6 fold to be as good as a cloud provider, who might use at most 3-fold).
  • trustworthiness - a cloud provider makes a deal with you and can be trusted or else goes out of business. Your peer-group in an edge cloud may well include bad actors. so now you need byzantine fault tolerance, which is a lot harder than just having majority consensus.
  • persistence - cloud providers have mostly shown that they are not "fly by night" (or the ones that are have gone bust). Personal devices come & go - I think I've seen estimates that people change smart phone about every 2-3 years (many contracts are designed around that). While newer devices have more storage, they typically arrive with very simple cloud based synch with old devices (though Android & iOS can do device-to-device synch of course). Users also have tablets, laptops and workplace systems that all need synching, and they expect long term survival of their content (e.g. personal content like email/messaging, photos/videos, even music, films, games etc) as well as of any ongoing processing. Cloud based systems have amortized the costs of backup/replicas and, indeed, of deploying new storage tech, at a wholly different scale from the edge. New technologies (synthetic DNA, silica/glass storage) are already being looked at for the next steps in that context, but are years away from being affordable/usable for edge devices (if ever, given their slow write times....do edge users have the patience or personal persistence to take care of that?).
  • capacity - while newer devices have impressive storage, that just makes things worse, since one moves from just taking photos on the phone, to HD video and on to augmented reality and so on.
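The replication estimate in the availability bullet (about 6-fold at the edge against at most 3-fold in the cloud) and the consensus point in the trustworthiness bullet can be illustrated with some back-of-envelope arithmetic. This is only a sketch: the per-node availability figures are assumptions picked for illustration, not measurements, and replicas are assumed to fail independently, which real edge devices (shared power, shared access network) often do not. The group sizes are the standard lower bounds of 2f+1 for crash faults and 3f+1 for Byzantine faults.

```python
def service_availability(node_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies is up."""
    return 1.0 - (1.0 - node_availability) ** replicas


def replicas_needed(node_availability: float, target: float) -> int:
    """Smallest replica count whose combined availability reaches `target`."""
    count = 1
    while service_availability(node_availability, count) < target:
        count += 1
    return count


def crash_group_size(faults: int) -> int:
    """Minimum group size for majority consensus tolerating `faults` crashes."""
    return 2 * faults + 1


def byzantine_group_size(faults: int) -> int:
    """Minimum group size for agreement tolerating `faults` Byzantine members."""
    return 3 * faults + 1


# Assumed per-node availabilities, for illustration only.
CLOUD_NODE = 0.99  # one replica in a data center
EDGE_NODE = 0.92   # one personal device on an access network

cloud_target = service_availability(CLOUD_NODE, 3)
edge_replicas = replicas_needed(EDGE_NODE, cloud_target)

print(f"3 cloud replicas give {cloud_target:.6f}")
print(f"edge replicas needed to match: {edge_replicas}")
print(f"tolerating 1 bad peer: crash={crash_group_size(1)}, byzantine={byzantine_group_size(1)}")
```

With these assumed numbers the edge needs six replicas to match three in the cloud; different assumptions move the ratio, which is rather the point of wanting metrics here.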
There are no doubt other things one could add (complexity of key management not the least, although recent thinking in that area might provide some solutions) but for me this needs a concerted effort to re-think the job - if I only depend on other users' edge devices, then how do I find the 5 other users, and what is in it for them, to replicate my services so that I can continue to see them wherever and whenever I need? How do we deal with continuous upgrades over vast numbers of disparate end users' systems, rather than vast numbers of systems in a small number of administrations? What are the ways one can seek redress in such a world, for data loss or breach of confidentiality? And on and on...

So one approach that is promising is what has happened with Mastodon, which is only "somewhat decentralised" in that each instance serves potentially quite a few people, and indeed, may not be running on the Raspberry Pi in someone's attic, but on a cloud infrastructure service. The key point is who controls access to applications & content, not where the bare metal compute is located. At least for now. Indeed, even if one distributed the computers, one is still largely dependent on centralised electricity generation (ok, so I have solar, but how many people do...?).

So decentralised keys, but somewhat centralised, but migratable services looks pretty ok to me...to be getting on with.


There's a temptation to claim that a centralised system can emulate the federated/decentralised properties concerning control/ownership/access by encrypting data in storage, transmission, and processing (enclaves, homomorphic encryption or secure multiparty computation), hence partitioning the data by key/function management, rather than by spatial/device ownership. While this might work in terms of data use, it doesn't deal with several centralised system failure models, including (non-exhaustive list): denial of service (deletion, whether intentional or through failure/business exit of the central organisation, change of T&C etc); and a small gene pool of software, hence a likely vulnerability that strikes all customers of the central service at once (a data breach might not happen, but it might too), whereas such an attack on a large heterogeneous system would be far less likely to impact as many users.

However, on the other side of that coin, if I want decentralised systems that use replication that is itself heterogeneous, I need to have the copies run on other people's systems, and those people might not be any more trustworthy than the cloud/central providers....so one might still want enclaves and homomorphic crypto on the social net friends' replica systems, but the crypto then militates against some of the energy savings of being at the "edge". Indeed, edge to edge communications might be very much more complex than edge<->central, which the network has been optimised for. Replica consistency in the edge case is a relatively new thing, and consensus protocols haven't been stress tested anywhere near as much as necessary to say they'd be as good as the simpler cloud/data center replication environment's use of algorithms like Raft etc...

So it is not quite so simple to compare decentralised high availability systems with centralised high confidentiality systems.


Monday, March 27, 2023

the seven laws of robotherm

 Laws of Robots v Thermodynamics 

with apologies to wikipedia.


1 A robot may not injure a human being or, through inaction, allow a human being to come to harm.  


1.  the total energy of an isolated system is constant; energy can be transformed from one form to another, but can be neither created nor destroyed.


Add human, and make it conservation of life


2 A robot must obey orders given it by human beings except where such orders would conflict with the First Law. 


2. When two initially isolated systems in separate but nearby regions of space, each in thermodynamic equilibrium with itself but not necessarily with each other, are then allowed to interact, they will eventually reach a mutual thermodynamic equilibrium. 


Add human, and make it irreversibility

(aka associative, commutative, etc)


3 A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.


3. A system's entropy approaches a constant value as its temperature approaches absolute zero.


Add humans, make it equilibrium or perhaps, fairness



0 “A robot may not harm humanity, or, by inaction, allow humanity to come to harm.”


0 If two systems are both in thermal equilibrium with a third system, then they are in thermal equilibrium with each other.


Add human, and make it a scale-freeness property

(society is an ensemble of individuals)

Tuesday, March 21, 2023

federation, for interoperability, needs to be decentralized.

 We're starting to see how open interoperability requirements might be a constructive way forward to reducing the harms potentially caused by oligopolies in the tech sector. This, especially in the EU, is one response to the DMA, and seems perhaps less painful than enforcing the break-up of big, successful, and sometimes innovative companies or their services.


However, if proposed platforms for interoperation are centralised, and they are in the least bit successful (for example in the domain of secure group messaging, as in this IETF work), then those platforms will need to scale to the same size as a significant subset of the users of the systems they interconnect. That raises the question, "what is to stop them just becoming another one of the big few?", or indeed replacing several of them, leading to an even smaller gene pool. If they can solve the security and UX challenges in bridging all the services (and their meta-protocols/data like key management/onboarding, mutual trust management, spam/abuse management, etc etc), then they are as good as, or perhaps better than, the sum of all their parts.

We can prevent this trend by simply designing the interoperability platform as a pure federation service, which is infrastructureless - it can run in clients only, or it can run as a classic P2P or decentralized service piggybacking on relatively inexpensive servers that many users have already deployed (e.g. use of raspberry pi for mastodon servers is already a thing). But given the separation of concerns, there's no reason not to run a decentralised application protocol service on a choice of centralised cloud providers (infrastructure-as-a-service providers, that is, not AAAS).

We have built another example of such a thing for digital identity called  trustchain, as a proof-of-concept. 

Saturday, January 28, 2023

FaceBookem Danno

Here's a thought - why don't face recognition systems not only find a match in their database but, like many recommender systems, also find nearby matches (netflix recommends me movies and says "because you watched this and liked it")? And, without being privacy invasive, one could just use synthetic faces and the error (precision/recall) in the face recognition network to show the range of kinds of faces that would also match - e.g. one can just use a GAN, like this fake face generator site does:



Note propagating error information is also already the basis for all those fancy stable diffusion image things like Dall-E 


A nice explainer here, for example: so should be very easy.


A neat thing to do with this would be to virtualise "identity parades". - instead of picking some likely looking suspects off the street, one could generate a set of faces (or figures or even videos)....one thing this would also do would be to underline how bad humans are at identifying people from a remembered incident - indeed, one could do some interesting cognition/perception/memory/bias research on the differences between poorly trained AIs, and poor old people.


We could call this "facebookem" perhaps :-)


A nice demo would be a "dark mirror" app, that would run on your camera phone, and pick at random a synthetic face that would pass for you in a line-up.


This paper on face replacement does a similar job, but uses different faces, rather than ones drawn from a GAN trained on the face we're trying to undetect.


Thursday, January 26, 2023

re-decentralization - the power struggle

 there are some great scalable decentralised systems now showing real deployment, like mastodon's decentralised open source social network and the matrix open secure messaging platform (just my faves, add yours!!). But they still depend on electricity and internet access.

we have some community internet services such as the Guifi open community mesh net - again, just my fave, but there are many. They all illustrate the care needed with governance models...

so what about that power struggle? well I just put solar on my roof, but that only gets me half the year (actually I sell to the grid for 6 months, but we're too far north and cloudy to really run in the depths of winter) - but there are community projects to put up larger scale locally owned solar (and wind) and also in the fairly near term too, ground source heat pumps, run/owned collectively (otherwise capital cost is a bit of a hill to climb) - this project I just invested in, in my local area, is an interesting example

power up north london sidesteps the need for central generation, and government or private sector, except that we still need (albeit a re-tweaked) grid distribution of power...that can come next..

Arguments for this are a) it can scale up fast b) it provides resilience (against variation in weather, geo-politics, price, you name it) and is sustainable.

Arguments against this are not a lot as far as I can see - by vesting in community ownership, you commit people to the maintenance of the system, which in any case is far less onerous than a) current central service bills and b) the capital cost of deploying these systems in the first phase. Like collectively owned barns in the past, and cooperative savings & loans not-for-profits, we can structure many of the modern world's infrastructures in new (but actually old) ways...

Tuesday, January 24, 2023

embrace inconsistency...eventually

 a lot of distributed systems folks get a bee in their bonnet about eventual consistency, and devise ever more baroque solutions to the problem of byzantine fault tolerance (and some lovely work, don't get me wrong - the Byzantine-Altruistic-Rational-Tolerant work is an awesome synergy of so many cool thoughts, and is useful)...

however, a lot of systems are never actually consistent, but are still useful. two examples to illustrate

1. the world wide web - people berated Berners-Lee when he devised uni-directional URLs, as leading to inconsistency - dangling references abound, but the web (including the spiders & robots that index it for search engines or LLMs) works. You just crawl what is there and leave out what is not currently reachable...you can devise all sorts of heuristics based on stats of what you find over time, to decide when to revisit...

2. routing - vectoring algorithms (RIP, BGP) are possibly eventually consistent, but the convergence time often exceeds the mean time between failure/repair, so that the system of routes never converges. Vectors stored at different nodes in the net point in directions no longer "globally" consistent - however, forwarding along those paths may encounter changes of direction as packets encounter more up-to-date routers and get their progress "put to rights" - of course, one problem this can lead to is transient loops - these are eliminated in many protocols (BGP has one trick, there are plenty of others). So then how "bad" are the temporarily divergent routes, and what do we save in routing protocol/computation overheads by not worrying about this, if many packets get to the destination, albeit via the scenic route, rather than the "best" by some metric?

We don't really know. But we should, as we could actually design algorithms that work in the presence of inconsistency and have some sort of predictable performance across some range of topologies and failure/repair statistics. 
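As a toy illustration of the kind of experiment this suggests, here is a minimal sketch of hop-by-hop forwarding when some routers still hold next hops computed before a link failure. Everything in it is an assumption made for illustration: the five-node topology, unit link costs, a single destination, and the idea that a router is either fully stale or fully up to date. It is not RIP or BGP, just breadth-first shortest paths with a mixture of old and new tables.

```python
from collections import deque


def neighbours(links, node):
    """Sorted list of nodes directly linked to `node`."""
    return sorted(b if a == node else a for a, b in links if node in (a, b))


def next_hops(links, dst):
    """Shortest-path next hop toward `dst` for each reachable node (unit costs)."""
    hop = {}
    seen = {dst}
    queue = deque([dst])
    while queue:
        node = queue.popleft()
        for peer in neighbours(links, node):
            if peer not in seen:
                seen.add(peer)
                hop[peer] = node  # peer reaches dst via node
                queue.append(peer)
    return hop


def forward(src, dst, table, links, ttl=16):
    """Follow per-router next hops; return the path, or None if dropped or looping."""
    path = [src]
    while path[-1] != dst:
        here = path[-1]
        nxt = table.get(here)
        if nxt is None or nxt not in neighbours(links, here) or len(path) > ttl:
            return None
        path.append(nxt)
    return path


# Hypothetical topology; the link B-D then fails.
OLD_LINKS = {("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")}
NEW_LINKS = OLD_LINKS - {("B", "D")}

old = next_hops(OLD_LINKS, "D")
new = next_hops(NEW_LINKS, "D")


def mixed(stale):
    """Tables where routers in `stale` still use pre-failure next hops."""
    return {node: (old if node in stale else new)[node] for node in new}


best = forward("A", "D", new, NEW_LINKS)                  # everyone converged
scenic = forward("A", "D", mixed({"A"}), NEW_LINKS)       # only A is stale
looped = forward("A", "D", mixed({"A", "C"}), NEW_LINKS)  # A and C are stale

print("converged:", best)
print("A stale:  ", scenic)
print("A+C stale:", looped)
```

With only A stale the packet still arrives, one hop longer than the converged path; with A and C both stale it bounces between B and C until the hop limit expires, which is the transient loop case. Sweeping the stale set over larger topologies and failure/repair rates would give the sort of path stretch and loss statistics asked for above.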

Sure, we can also devise algorithms that attempt to maintain global consistency (c.f. link-state routing) but these may have scale limits. In general, I feel that some sort of thinking about diffusing computations intermediate states' trajectories and consequential effective actions seems warranted. There has been some work (I know one of Tim Griffin's students had some neat ideas in this space) but it might need a new approach, which might yield value elsewhere - this is not quite as clean as (say) CRDTs which divide and conquer the consistency problem, but some similar kind of thinking might apply.

The 3 Robots at Law

1. a robot lawyer shall be interpretable...

2. a robot lawyer shall not be subject to infinite recursion attacks; 

3. a robot lawyer shall not reprogram the adversary's robot lawyer to lose; 


4. emergent 4th law: robot lawyers shall only have 3 wishes, and only 1 will be granted, and that one at random.

Wednesday, January 18, 2023

ballet notation and video conferencing. back in 1992...

 I wrote this - please excuse the formatting (can't find a version of the UCL specific nroff MM macros)...


use of Labanotation to automate the annotation of video conferencing, something I heard about in the context of AI last week at MPI (31 years later)....and stable diffusion

	  "Some Experience of co‐authoring with synchronous and
	      asynchronous computer mediated communication"
			       J. Crowcroft
			       M. d’Inverno
		      Department of Computer Science


				 ABSTRACT


       UCL has	a  digital  multi‐media	 conferencing  system  that
       connects	 us  to	 8 sites in the US and one site in Germany,
       over the Internet. It allows up to 4  way  video/audio,	and
       all the normal data communications facilities. These include
       Electronic Mail and BBoards as well as a	 shared	 multimedia
       document editor called mmconf/Slate. The conferencing system
       is described in some detail in [1] and [2].

       This paper is about  specific  experiences  using  this	and
       related	systems.   For	comparison,  the  use  of  more
       old‐fashioned asynchronous facilities to co‐author a document
       is briefly described.

       An approach is also proposed to help with the description of
       the visual side of human computer mediated communication.

       It should be stated that most of the use of the	system	has
       been  self‐referential;	although  there	 have  been "naive"
       users  (UCL  and	 MIT  librarians),  their  usage  has  been
       restricted  to synchronous meetings. This paper is anecdotal
       rather than analytical.





       1.  Introduction

       First let us define the terms in the title:

	  • Co‐authoring

	    Two or more people write a	text  together.	 There	are
	    several   interesting   points   on	  the  spectrum	 of
	    distribution of labour in co‐authoring  which  interact
	    with  the  physical	 distribution  of  the authors ‐ we
	    comment on these in the conclusions.

	  • Synchronous1

       ____________________

       1. Some	use  the  term	isochronous  ‐	 this	is   wrong.
	  Isochronous  networks	 maintain  clock synchronisation on
	  bits at all points in the network










	    Synchronous communication is soft real  time.  In  long
	    haul  networks  such  as  the  Internet, there may be a
	    delay between sending video/audio/data and receiving it
	    (Einstein  is insurmountable) but it is bounded to some
	    reasonable number (because	of  the	 video	compression
	    technique we use, it can be quite high ‐ as much as 400
	    msecs,  or	even  2‐3  seconds  under   specially	bad
	    conditions!).

	  • Asynchronous

	    By	this  we  mean	use  of	 electronic  mail  and file
	    transfer/access out of  band  from	direct	voice/video
	    communication between people.

	  • Computer‐mediated

	    This  is  harder to define. The network that carries E‐
	    mail, files and video/voice is made of a set  of  links
	    connected  by nodes which are special purpose computers
	    (switches/routers). For E‐mail (especially	multi‐media
	    mail)  the	end  system  that the human uses is general
	    purpose  computer/workstation,  which  can	apply  many
	    processing	techniques  to the data. However, for Video
	    and Voice, only very special purpose processing is done
	    to	maximise  the quality (the form). A great deal of
	    research is needed before more  interesting	 processing
	    could  be  done  on what is said and how during a video
	    conference (the content).

	    However, material from off‐line  is	 presented  on	the
	    same  screens in our system as the video, so a document
	    being discussed is juxtaposed to the speakers.

	  • Webster defines communication as:
	    com.mu.ni.ca.tion ‐.my:‐n*‐’ka‐‐sh*n n 1: an act or instance of
	       transmitting 2a: information communicated 2b: a verbal or written message
	       3: an exchange of information pl	 4a: a system (as of telephones) for
	       communicating 4b: a system of routes for moving troops, supplies, and
	       vehicles 4c: personnel engaged in communicating 5: a process by which
	       meanings are exchanged between individuals through a common system of
	       symbols pl but sing or pl in constr  6a: a technique for expressing ideas
	       effectively in speech or writing or through the arts 6b: the technology of
	       the transmission of information

       The rest of this note is divided into three main sections:

	 1.  A brief description of a fairly successful	 production
	     of a document through asynchronous communication. This
	     is used to set the	 parameters  for  what	may  be	 of
	     interest in synchronous collaboration.

	 2.  Details  of  two  major meetings and the documents
	     produced from them.











	 3.  An approach to help with the description of the visual
	     side   of	human  computer	 mediated  communication  is
	     described. This may also be  used	to  help  prescribe
	     which visual channels for communication are used, and
	     when, for systems which have limited bandwidth and cannot
	     allow  fully  cross  connected  video  and audio (many
	     systems have full cross connected audio  with  speaker
	     video only, for bandwidth reasons).
       Finally we present some conclusions.


       2.  Asynchronous Collaborative Experience

       A  paper	 was  written  by  4  authors,	2  in  London, 1 in
       Cambridge and one in Colorado. Two of the  authors  had	not
       met face to face. The paper involved a repeatable experiment
       and performance analysis. It took about 1 week elapsed  time
       to  write  and will be published in a respectable (refereed)
       journal [4].

       It evolved as follows.

	 i.  Author in US E‐mails author in London  with  anomalous
	     output from an experiment.

	ii.  Author  in	 London	 E‐mails  2nd author in London with
	     problem statement.

       iii.  2nd author in  London  re‐implements  experiment,	and
	     discovers	same results. 1st author in London outlines
	     components in the system that  could  be  causing	the
	     problem.

	iv.  Author in Colorado writes script to run experiment and
	     graph output, and sends to London.

	 v.  2nd Author in London hypothesises problem explanation,
	     and a possible solution.

	vi.  Authors  in  London  E‐mail  author  in  Cambridge and
	     Colorado with problem and hypothetical solution.

       vii.  Cambridge author E‐mails colleagues  for  availability
	     of code to implement solution.

       viii. Meanwhile, draft paper is E‐mailed to interest list.

	ix.  Member  of interest list is editor of journal, asks if
	     we want to submit it ‐ we do.

	 x.  Referees’ comments sent (anonymously)  by  E‐mail  that
	     they want to see implementation of solution.













	xi.  None  of  the  authors  has  source  code  to  implement
	     solution, so they ask interest‐list.

       xii.  Colleague in Sweden on  interest  list  changes  code,
	     compiles  and sends new (unix kernel) object to London
	     (by E‐mail).

       xiii. Authors in London run script by  author  in  Colorado,
	     get   correct   results.	These  are  (automatically)
	     included in the final paper which is E‐mailed  to	the
	     editor of the journal.

       This  sequence  of  events  hides  two  extremely  important
       factors in the success of the collaboration:

	 1.  Commonality of interest/experience.

	     The  authors  came	  from	 very	similar	  technical
	     backgrounds   ‐   (by   coincidence,  all	trained	 in
	     engineering followed by computing).

	 2.  Independence of tools.

	     The authors were all  able	 to  use  tools	 they  were
	     familiar	with   for  mail  processing,  editing	and
	     document preparation as well as actually  running	the
	     experiment.

       It  also	 reveals  two  of  the	major  advantages  of using
       asynchronous communication:

	 1.  Authors can work simultaneously rather than having	 to
	     gain  some	 token	as  floor  chair  or editor‐of‐the‐
	     moment.

	 2.  Synchronisation of versions of a document is  achieved
	     at	 the same time as communication of a version ‐ i.e.
	     when any one author posts a new  version,	most  users
	     see this at roughly the same time.

       These  advantages  come  at  some  expense ‐ authors may work
       simultaneously to the same effect, and replicate each other’s
       effort.	Authors	 may  diverge, and have to merge subsequent
       versions	 of the work somehow.2 However, replicated work has
       scientific value and divergence may well be productive.
       The  reduction of waiting time to gain the lock on a version

       ____________________

       2. The use of a revision control system was obviated by	the
	  documents  arriving  as  e‐mail,  and	 therefore having a
	  unique source/author and timestamp/message id. Merging in
	  many	 RCS/SCCS   type   systems  makes  use	of  context
	  difference systems which are available  in  any  case	 as
	  part of the normal operating system tools.










       was certainly advantageous (especially given different  time
       zones,  when  finding the person who had the lock might take
       1/2 a day).


       3.  Synchronous Meetings and Minutes

       In this section we describe two meetings that were  held	 in
       rather  different  ways	using  the multi‐media conferencing
       system, and report some of the users’ experiences.

       3.1  The IETF Directory Working Group

       The first is a 4 site video conference which made use of the
       UCL  to US system to hold a research group meeting of around
       35 participants. The meeting was	 held  in  rooms  that	are
       designed	 like  small  studios  with  a small audience and a
       small number of podium  speakers,  with	cameras	 switchable
       from audience to podium or individual speaker. The resulting
       document from this meeting was a report of the meeting which
       is available as a UCL technical report.

       Workstations were available at each site during the meeting.
       Normal editing/document preparation tools were available	 as
       well as normal electronic mail and file transfer.

       Groups  like  this  usually   choose  not  to  use the shared
       editing system for several reasons:

	 1.  Lack of familiarity ‐ mmconf/Slate	 is  a	hybrid	GUI
	     WYSIWIS  system  ‐	 partly like Apple Mac, partly like
	     Sun Open Look and partly X Windows.

	 2.  Size of the meeting and lack of screen real estate	 to
	     show  the	documents to all sites as well as video. It
	     is felt (whether correctly or not we cannot say)  that
	     the video is more important.

	 3.  This  particular group desired independent minutes from each
	     site ‐ this did not require shared	 editing.  If  they
	     had  used	shared editing, they may have felt that the
	     meeting  dynamics	were  undermined   by	the   floor
	     exchange/token exchange delays.

       3.1.1  Pre‐meeting  E‐Mail

       Setting up the meeting itself with aims clearly  stated  and
       strong  chairing	 seems	to be useful.  Electronic mail informed
       all the users of a strict timetable  for	 the  agenda  and
       presentation based meeting, e.g.
















       17:00 Introduction
	    o Discussion of Videoconference modus operandi
	    o Agenda
	    o Minutes of previous meeting
	    o Matters arising
	    o No liaisons!

       17:15 Document Status.  Review status of all working documents,
	   Internet Drafts, and submitted RFCs.
       17:25 Presentation of Pilot Activity
	   ...
       18:15 US/Europe liaison issues
       18:30 Management of ‘‘experimental’’ object identifiers
       18:40 Naming Guidelines (Paul Barker)
       19:10 Representing Network Information (Chris Weider)
       19:55 Security (Peter Yee)
       20:20 Naming in the US in light of NADF 123 (Marshall T. Rose?)
       20:50 Date and Venue of next meeting
       20:50 ‐‐ 21:00 AOB

       Notice  the  use of a strict timescale and objectives. Also,
       this group does a great deal of work by E‐mail  (typically  a
       few  messages  a day over the last few months, to the group’s
       distribution list).

       In the event, due to technical difficulties, the strict
       timescales did not come into play. The meeting started late
       (20 mins or so) and the connection went down for at least a
       quarter of an hour at one point. The meeting also finished
       early, by about a quarter of an hour.

       The agenda was not followed strictly  as	 some  participants
       were not available for the whole meeting.3

       3.2  An informal Meeting

       The  UK‐US  Network Interconnection is managed by a group of
       expert networks operations staff (the Operational Management
       Group  ‐ OMG) who have monthly video conferences. These have
       roughly 2 people per site, and 3 or 4 sites in  the  US	and
       UK.

       These meetings are held in a small meetings room with wall
       projection TV and a workstation running a shared editor.
       This is used to show output from previous meetings and
       experiments/measurements of the network’s
       availability/performance.

       ____________________

       3. Paul Barker commented: My feeling was that the meeting
          was much like other meetings I go to: those attending
          sense when something important is being discussed, and
          where progress is being made, and the discussions run on
          to allow the issue time.

       One  document  that was produced as a result of the planning
       and exchanges of ideas in these meetings	 can  be  found	 in
       [3].

       The  use	 of  a	shared	view  for  these  meetings  is very
       successful. Again, like the asynchronous collaboration, this
       group has common expertise.

       A  major	 difference is that the group is task oriented, and
       thus can identify a given member to show a  given  document,
       while the others simply view.

       Under some circumstances, others will extract data from some
       document or graph, and re‐process it, but this works well in
       this  environment.   No	locking	 or  token system is used ‐
       simply the ability to show a workstation window or screen to
       the other sites.


       4.  Gesture Detection ‐ Future Work

       We are working on annotating video tapes of the live
       meetings so that we can make objective comparisons between
       video and face to face meetings. To this end, we are
       adapting one of the modern Ballet notations as a non‐verbal
       communication description language ‐ we’ve called this
       Balgol ’92.

       Dance notation, or the process of recording people’s
       movement, dates from Arbeau’s early attempts [5]. A
       particularly scholarly recent work is Guest’s [6], which
       includes a very useful chapter on the use of computers for
       annotating movement.

       Recent work in Zoology and Neuro‐physiology has demonstrated
       the special applicability of the Eshkol‐Wachmann movement
       notation [7] to the objective study of movement and gesture
       [8].4 We have dismissed Labanotation and Benesh notation as
       deriving too much from the musical context from which they
       arose, and for the pragmatic reason that they depend on the
       human interpretation of their somewhat hieroglyphic syntax.
       Eshkol‐Wachmann, however, is based partly on an engineering
       model of movement and is more amenable to computer
       recognition and analysis.

       Other work related to this is the use of the (Set theory
       based) formal specification language Z to specify floor
       control schemas ‐ this allows a completely general
       description of any floor control algorithm. We have looked
       at how speech acts [9][10][11] may be used to help structure
       bids/negotiations for the floor. [1]

       ____________________

       4. We are indebted to Michael Recce from the UCL departments
          of Computer Science and Anatomy for directing us to the
          rich body of work in this area.

       By  analogy,  we	 are  trying  to  see  if the same could be
       applied to automatic gesture recognition to two ends:

	 1.  Description.

	     There  is	constant  debate  comparing  face  to  face
	     meetings  with video conferences [e.g. 12]. We need an
	     objective measure of increased quality of meeting when
	     audio  and	 text  are  enhanced  with  a  view  of the
	     participants. We need to know  how	 many  participants
	     need  to  be  seen, and at what resolution for a given
	     increase in communication	richness  against  a  given
	     cost (more cameras/displays/bandwidth).

	 2.  Prescription

	     If	 we have limited bandwidth, we may use detection of
	     "significant" movements to switch camera focus/view to
	     other    users    in   a	video‐conference.   Judging
	     significance may be feasible.

       Initial work in	the  UCL  Anatomy  department  has  already
       demonstrated  that  it is feasible to recognise limited sets
       of postures in rats in real  time  from	video  frames.	Two
       important pre‐requisites are:

	 1.  Foreground/background contrast must be very high

	 2.  The  Lexicon  of  gestures/postures  must	be known in
	     advance, and areas of interest outlined.

       Both of these requirements can  be  easily  fulfilled  in  a
       videoconference.5

       To this end, we have started designing a language with a
       lexicon of common postures. The syntax of the language is
       based on the idea of Eshkol/Wachmann that we understand the
       limbs and body as a mechanical system. Thus we can describe
       movement of limbs around a spherical coordinate system
       centered on the torso. A gesture is some sequence leading
       from one posture to another.
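
As a rough illustration only, and not the Balgol ’92 design itself, a posture lexicon of this kind might be sketched as below. The sketch assumes that each limb's direction is given as two angles on a sphere centred on the torso and quantised to 45 degree steps; the step size, the limb names and the lexicon entries are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List

STEP_DEGREES = 45  # assumed quantisation step, chosen for illustration


@dataclass(frozen=True)
class LimbPosition:
    """Direction of one limb on a sphere centred on the torso."""
    azimuth: int    # horizontal angle in steps, 0 = straight ahead
    elevation: int  # vertical angle in steps, 0 = straight down, 4 = straight up


# A posture maps limb names to positions; a gesture is a sequence of postures.
Posture = Dict[str, LimbPosition]
Gesture = List[Posture]


def quantise(azimuth_deg: float, elevation_deg: float) -> LimbPosition:
    """Snap measured angles (in degrees) to the nearest lexicon step."""
    steps_per_circle = 360 // STEP_DEGREES
    az = round(azimuth_deg / STEP_DEGREES) % steps_per_circle
    el = max(0, min(180 // STEP_DEGREES, round(elevation_deg / STEP_DEGREES)))
    return LimbPosition(az, el)


# Hypothetical lexicon entries, known in advance as the text requires.
LEXICON: Dict[str, Posture] = {
    "rest": {"right_arm": LimbPosition(0, 0)},
    "arm_forward": {"right_arm": LimbPosition(0, 2)},
    "arm_raised": {"right_arm": LimbPosition(0, 4)},
}


def name_posture(posture: Posture) -> str:
    """Look a posture up in the lexicon; anything else is 'unknown'."""
    for name, entry in LEXICON.items():
        if entry == posture:
            return name
    return "unknown"


def transcribe(gesture: Gesture) -> List[str]:
    """Name each posture in turn and drop consecutive repeats."""
    names: List[str] = []
    for posture in gesture:
        name = name_posture(posture)
        if not names or names[-1] != name:
            names.append(name)
    return names
```

With this representation a gesture such as raising an arm is transcribed as the sequence of lexicon names it passes through, for example rest, arm_forward, arm_raised.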

       ____________________

       5. We note that many digital video systems use parallel
          machines to run the video compression necessary to
          transmit video over today’s limited bandwidth networks.
          These compression techniques include motion
          detection/compensation and also transforming the image to
          the frequency domain. Both of these, if available to a
          gesture detection system, would automatically provide
          strong hints as to the locality of "significant"
          movement.

       One  modest  experiment	will  be  the ability to detect the
       visual equivalent of an interruption ‐ currently	 the  audio
       systems	use  silence  suppression.  It	may  be	 useful	 in
       situations  with	 limited  bandwidth   to   employ   similar
       approaches  with	 the  video.  Most  people  attempt  to get
       attention in a meeting by waving their arm in the air.  This
       is  very	 easily	 detected as compared with most other, more
       random movements.
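
A minimal sketch of how such a visual interruption might be detected is given below. It is simple frame differencing, not the method used in the rat posture work, and it assumes that the camera is framed so that a raised arm appears in the top third of the picture. Frames are plain grids of grey levels, and the thresholds are hypothetical values that would need tuning.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grey levels, row 0 at the top of the picture


def changed_fraction(prev: Frame, cur: Frame, rows: range,
                     threshold: int = 30) -> float:
    """Fraction of pixels in the given rows whose grey level moved by more than threshold."""
    changed = 0
    total = 0
    for r in rows:
        for a, b in zip(prev[r], cur[r]):
            total += 1
            if abs(a - b) > threshold:
                changed += 1
    return changed / total if total else 0.0


def detect_wave(frames: List[Frame], min_fraction: float = 0.1,
                min_run: int = 3) -> bool:
    """True if the top third of the picture keeps changing for min_run consecutive frame pairs."""
    if len(frames) < 2:
        return False
    top = range(len(frames[0]) // 3)  # assumed region where a raised arm shows up
    run = 0
    for prev, cur in zip(frames, frames[1:]):
        if changed_fraction(prev, cur, top) >= min_fraction:
            run += 1
            if run >= min_run:
                return True
        else:
            run = 0  # movement must be sustained, not a single flicker
    return False
```

Requiring several consecutive changed frame pairs is one way of separating a sustained wave from the more random movements mentioned above; movement lower in the picture is ignored altogether.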


       5.  Conclusions

       The system is available for use, subject to booking. We	can
       allow video recording of meetings if any researchers wish to
       study how a real	 system	 is  used  when	 the  bandwidth	 is
       severely	 constrained,  but  the	 cost‐benefit of the system
       over travel is very large.

       The distribution of the authors has a profound effect on the
       way  document  authoring	 is done. In the two video meetings
       described, two different approaches were tried in  producing
       minutes/paper.  The directory group had all sites write full
       minutes, and then had the chairperson merge them.  This	may
       have produced very accurate minutes, but increased the
       workload.

       The Operations Management Group meeting was used solely as a
       blue‐skying/planning  meeting, after which a number of tasks
       were allocated, and the users went away, carried	 out  those
       tasks,  and  reported  their findings back to the authors of
       [3] who then wrote their paper. This is similar to the
       experience  with	 the  purely  asynchronously authored paper
       [4],  although  here  the  success  was	based	on   shared
       experience  of  the area of discourse (via education) rather
       than an initial face to face or video meeting.

       In  the	broadest  sense,   documents   just   record   some
       information,  so	 the videos of the conferences/meetings are
       themselves a form of document.  Indeed, we predict  that	 in
       the  future,  (automatically)  edited  highlights of a video
       meeting may replace minutes.


       6.  References and Acknowledgements

       Thanks to Steve Hardcastle‐Kille	 and  the  members  of	the
       IETF’s  OSI‐DS  Group  for permission to quote their meeting
       experience and minutes.	Thanks to  anonymous  referees	and
       Angela Sasse of UCL for comments on movement notation.

         1.  "Multimedia TeleConferencing over International Packet
             Switched Networks."  J. Crowcroft, P.T. Kirstein, D.
             Timm, Proc IEEE TriComm ’91, April 1991

         2.  "Specification, Design, and Implementation of an
             Interactive Conferencing System", Mark d’Inverno, Jon
             Crowcroft, April 1991, Proceedings of Infocomm 91,
             IEEE

	 3.  "Traffic  Analysis	 of  some  UK‐US  Academic  Network
	     Data.",  Crowcroft,  J  and Wakeman, I.  Proc INET ’91
	     Copenhagen, June 1991

	 4.  "Layering	Considered  Harmful",  J.   Crowcroft,	 D.
	     Sirovica,	 I.   Wakeman,	 Z.Wang,  To  appear,  IEEE
	     Networks, Jan 1992.

         5.  Orchesography (Thoinot Arbeau, 1589, trans. Mary Stewart
             Evans, Dover 1967)

         6.  Dance Notation, Ann Hutchinson Guest, Dance Books,
             1984

         7.  Movement Notation, N Eshkol & A Wachmann, London,
             Weidenfeld and Nicolson, 1958.

	 8.  Golani,  I,  "Homeostatic motor processes in mammalian
	     interactions: a choreography of display", Perspectives
	     in Ethology, vol 2, pp 69‐134, Plenum Press, NY, 1976

	 9.  Speech  Act  Theory  and  Pragmatics,  J.R.Searle Ed.,
	     Reidel Publishing Company, 1980, pp 40‐53, S. Davies ‐
	     Behaviour...

	10.  Foundations   of	Illocutionary	Logic	J.R.Searle,
	     Vanderveken, Cambridge, 1985 pp36 ‐40,  exposition	 of
	     illocutionary point ‐ taxonomy of points

	11.  On Human Communication, Cherry. MIT Press 1957.

        12.  Heath, C., Luff, P. "Disembodied Conduct: Communication
             through video in a multi‐media office environment",
             ACM SIG HCI 1990.


       7.  Appendix: Extracts from IETF DS Report and Comments

       7.1  Comments

       What the users said about the real time element:

	  • BBN:  Not as good as a face to face meeting, but better
	    than E‐mail.

          • RIACS: might be more effective to choose a few items
            and discuss to focus on the issues.

          • ISI: Technical quality appalling ‐ too much delay.
            Echo annoying. Sound poor. Scale: E‐mail ‐‐ 1, in
            person ‐‐ 10, then generally video ‐‐ 7, but this time
            ‐‐ 4 due to the delay and quality. An on‐line terminal
            may help.

	  • UCL

               • (SEK): ‘‘interesting’’, some useful discussion.
                 Presentations did not work. If too technical,
                 interchange did not work.

	       • Colin	Robbins: The delay and quality made it very
		 hard to hold a real meeting.

               • Comments from Paul Barker: "You get a lot of
                 information from seeing someone and hearing them
                 talk. I’d never met any of the people from the
                 States before, but I was rapidly able to form a
                 picture of who the "politicians" were, as opposed
                 to the strongly technical. In E‐mail, I edit
                 myself very carefully ‐ I missed not being able to
                 do this in "strange" company!!

		 The  large  delay6 in the system made me very self
		 conscious when talking.  To some extent, I  played
		 with	the   technology   to	see  how  long	the
		 propagation delays were ‐ I’d scratch an  ear	and
		 wait for the image in front of me to copy me!

		 The  delay  also  made	 for  a very presentational
		 style of address  to  the  meeting.   For  various
		 reasons  I  was rather under‐prepared to talk on a
		 subject which I had to	 present  to  the  meeting.
		 Whereas at a live meeting I would have been fairly
		 happy to have "chatted" informally on the subject,
		 such chat and half thought out ideas seemed to jar
		 somewhat with the  formal  style  of  presentation
		 (with	the  speaker  very  definitely	holding the
		 floor).  Self‐consciousness again, but it made	 me
		 feel rather ridiculous."



       ____________________

       6. At the time this meeting was held, the 4 way meeting was
          run by relaying n sites in the US through a single
          mixer/quadruplexor site ‐ this involved decompressing the
          video, mixing and recompressing. The CODECs which perform
          compression do so partly by buffering a number of frames
          and differencing them, thus introducing large delays.
          Relaying this is not a good idea (in fact one may
          introduce artifacts in the picture this way too, due to
          different lossy compression techniques interfering).

	       • Comments from Steve Titcombe:

		    • Quality of sound

                      The sound quality was fairly reasonable, from
                      all sites except one, which seemed to have
                      electronic bubble noises popping and bursting
                      every time someone from that site talked. This
                      was not too annoying, but it did mean that
                      you had to listen carefully to hear what was
                      being said.

		    • Quality of Pictures

                      Picture quality at the UCL site was very
                      good; the screen monitoring what was being
                      broadcast out was very good. Other sites’
                      pictures were pretty good, but when split
                      down into a 2x2 grid for four sites, it was
                      possible to see if someone had a beard or
                      not, but no more. (This was referring to
                      their shots of an entire room.)

       7.2  Report

       The meeting report is highly  structured,  and  follows	the
       sequence of events in the meeting.

       Reports often attempt to hide the author of each
       contribution, whereas this one, being merged from 4 versions
       produced at each site, repeatedly states the origin of each
       idea.

       It is not highly readable, except perhaps to members of	the
       group present.





Wednesday, December 28, 2022

the internet is out of time

do I mean that it is past its sell-by date? well, possibly.

do I mean that it is asynchronous? well, probably.

do I mean that it sits outside the regular continuum that we inhabit? well outside.

do I mean that it emerged from within time? well within widdershins.

am I going to continue with this out of time line or move on to claim that, for example, it is out of its mind?


Thursday, November 24, 2022

In Network Compute and the end-to-end design principle(s)

 There's some confusion about this - the e2e principle was originally about OS layering and the idea of parsimony. It was transferred by folks at MIT to the functionality of communications protocol layers, hence we get the "thin waist" of the TCP/IP stack, IP, and the plethora of link and physical layer technologies, and the diversification of transport (end-to-end) protocols and applications and shims above, particularly end-to-end encryption (TLS or QUIC built in etc). All good.


A. Now add in-network compute and two things happen - 

1/ Compute is the end point of some data and normally would therefore need keys to decrypt comms. 

2/ Compute is another resource along a path so we now have recursive layering - the common use cases assume there are "final" end points, but we need all the usual services we expect "end-to-end" for those AND for the in-network compute middle-end point - i.e. not just crypto, but also integrity, reliability, flow and congestion control, and so on, as these intermediaries are talking over IP, which doesn't do that, because thin-waist etc

3/ So we just have recursive e2e - no problem there. Just another tunnel/vps etc


B. Ah, but now lets do something less old-fashioned - what if

a) the in-network compute is able to work on encrypted data (e.g. is a homomorphic crypto function) or is a secure multiparty computation and

b) the in-network compute is redundant (or loss tolerant) too.

Then we don't need it to be a principal in the e2e2e crypto. Nor do we need integrity or reliability checks.
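
a toy sketch of the "compute without being a principal" idea, using additive secret sharing (a standard building block of secure multiparty computation) as a much simpler stand-in for homomorphic encryption. each sender splits its value into random shares, each in-network node only ever adds up shares it cannot interpret, and only the final end point recovers the total. the modulus, function names and three-node layout are assumptions for illustration; this sketch does nothing about colluding nodes or loss tolerance.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # assumed modulus; it only needs to exceed the true sum


def share(value: int, n_nodes: int) -> List[int]:
    """Split value into n_nodes random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_nodes - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def node_sum(received: List[int]) -> int:
    """What one in-network node computes: a sum of shares it cannot interpret."""
    return sum(received) % MODULUS


def combine(partials: List[int]) -> int:
    """What the final end point computes from the nodes' partial sums."""
    return sum(partials) % MODULUS


if __name__ == "__main__":
    readings = [12, 7, 30]                       # one private value per sender
    per_sender = [share(v, 3) for v in readings]
    # node i receives the i-th share from every sender
    partials = [node_sum([s[i] for s in per_sender]) for i in range(3)]
    print(combine(partials))                     # the total, seen only by the end point
```

each node sees only uniformly random-looking numbers, so it needs no decryption keys, which is the property the argument above relies on.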

However, in both A and B, we do still need flow/congestion control, and, what is more, that resource management is now no longer merely based on queues (ECN etc), but is based on computational (and possibly associated storage) resource management too. And we need to signal that across the e2e2e protocol, not something TCP or QUIC do, but perhaps it could be added in to MASQUE, for example....
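
a hedged sketch of what combining the two signals might look like at a sender: a conventional AIMD window driven by an ECN-like mark, capped by a compute credit advertised by the in-network node. the `compute_credit` field is entirely hypothetical (no such signal exists in TCP, QUIC or MASQUE today), as are the class and field names.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Per-round feedback; both fields are hypothetical signals."""
    ecn_marked: bool     # a queue on the path signalled congestion
    compute_credit: int  # requests the in-network node can still accept


@dataclass
class Sender:
    cwnd: int = 1        # network window, in requests per round
    max_cwnd: int = 64

    def on_feedback(self, fb: Feedback) -> int:
        """Update the window (AIMD) and return how many requests to send next round."""
        if fb.ecn_marked:
            self.cwnd = max(1, self.cwnd // 2)             # multiplicative decrease
        else:
            self.cwnd = min(self.max_cwnd, self.cwnd + 1)  # additive increase
        # the allowance is bounded by both the network and the compute resource
        return max(0, min(self.cwnd, fb.compute_credit))
```

the design point is just that the sender takes the minimum of the two limits, so a busy compute node can throttle traffic even when every queue on the path is empty.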

just a thought.

Friday, November 11, 2022

from centralised to decentralised - what's in the journey

 we're seeing a shift from central (meta/twitter) to decentralised (mastodon/matrix) 


aside from ownership, control, use of data, what's the difference?


for me, the difference is about defaults and assumptions made in naive, or initial (alpha) implementations


In a centralised system, the operator has centralised costs, and needs to offset those (cloud/data center charges or operational overheads) by monetizing your data (adverts) or your interest (subscription)


in a peer-to-peer (the older name for decentralised) system, these costs are a marginal increase in operations of systems run by the user. The extra processing/networking/storage incurred compared to having your client talk to the cloud is little (possibly even a decrease, since your peer group may be nearer).


so you don't need to run a business to pay for the infrastructure, because that is a given.

so then in the central system, it is very easy to data mine/run AI on all the users. It would take a lot of work to provide fine grain access control and cryptographic protection of privacy for all users - Privacy Enhancing technologies to allow such things would involve Homomorphic Encryption, for example, which would be a large increase in operational overheads. And they would need to be implemented and deployed.

so then in the decentralised system, users share only the data they wish, only with the other users they wish to share with. It would take a lot of work to design a decentralised data mining system to build models of all the users (e.g. some large scale federated learning, perhaps also using multiparty secure computation or the like).


so if you start decentralised, you are likely to stay that way for resource reasons, and you are likely to stay private.

so if you start centralised, you are likely to stay that way for capitalist reasons, and likely to stay privacy invasive


of course, the decentralised systems are, trivially, more sustainable. as well.


I know who I'd back, in the long run.

Tuesday, November 01, 2022

wired for sound - review of in-the-round jazz at cockpit theater, 31.10.2022

Went to the October 31st edition of Jazz in the Round at the excellent Cockpit theater in Maida Vale - this session offered

Mackwood - a trio (did not get guitar/bass players' names, but very very fine chord/ensemble work with the lead drummer) - some of their other work here but apparently, this line up has a recording out very soon - watch that space - a bit holdsworth, but restrained and more melodic, and clever rhythms

Million square - a sax/electronica duo - very fine - more here of theirs - some neat hybrid analog/digital loop/sample tricks - all to good purpose!

 Loz Speyer's Time Zone - i couldn't stay for the whole set, but got up to Mood Swings, which was great - they have a really crazily good cuba feel - loose but very tight under the hood with fun melodies and constantly shifting arrangements.


Loz also told two stories about songs he'd written travelling to&fro between UK and Cuba -


one brief explanation of a song "lo que no te mata" (the thing that doesn't kill you), which, apparently in Cuba, ends with "te engorda" (makes you fat, as opposed to the english "makes you stronger")....:)

the second was about the origin of the Mood Swings song, which was that he was staying with a friend in Cuba who'd been laid off from working on repairing lifts - apparently, they'd run out of other, older lifts to cannibalise for parts - most older stuff in Cuba is made out of pieces scavenged from french, latin american, US and (of course) russian parts to keep it going - he explained that that was how he wrote the song...



Weird anecdote of my own - in the 1970s, my mother, who was a concert pianist but also taught at the Royal Academy of Music, ran an improv event based on the story/themes of The Tempest at the Cockpit - i came along to help with sound and tech in general - i recall messing with a sax player who had the first copycat (wem?) loop box i'd ever played with - an echo through time and space! no sprites/troubled spirits of the air visited however, just light rain.

Tuesday, October 11, 2022

following orders

 so in this occasional series of articles on important topics, here's another question -

why do we teach the alphabet in alphabetic order?

I already outlined the fascinating history of why the alphabet (roman, greek, cyrillic, arabic etc) is in alpha order, which is actually quite interesting - but given we don't carve our letters in stone, wood or whatever anymore, couldn't we teach the letters in, say, the most useful order - which would be what I would call Holmes (though some people might say Huffman, or Linotype or Scrabble) - etaoinshrdlucmfwypvbgkjqxz
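
for anyone who wants to derive their own teaching order, here is a small sketch that ranks letters by how often they occur in whatever text you feed it. the function name is made up for the example, and the result depends entirely on the corpus, so it won't necessarily reproduce the order quoted above.

```python
from collections import Counter
import string


def frequency_order(text: str) -> str:
    """Return the letters a-z that occur in text, most frequent first (ties broken alphabetically)."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    return "".join(sorted(counts, key=lambda ch: (-counts[ch], ch)))


if __name__ == "__main__":
    print(frequency_order("the quick brown fox jumps over the lazy dog"))
```

letters that never appear in the sample are simply left out, so a decent-sized corpus is needed before all 26 show up.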


this is just the same as how we teach music - learn your do re mi first, and you have major (and sort of relative minor) scales ready to go, long before worrying about Mixolydian modes or accidentals, or quarter tones etc etc

the other thing about this i was thinking was about the joke about someone in a data center dropping a box of someone's punch cards and asking "does it matter what order they are in?". This is also illustrated in music: witness the infamous victor borge sight reading pieces from sheet music upside down, and pdq bach, of course, not to mention sviataslav shrdlu, whose compositions were rarely performed once in the same order.

Saturday, September 17, 2022

really self-sovereign Id

 forget about id cards or apps on secure mobile devices - the future is here and it is simple and secure.

We splice octopus DNA into humans so that we are all equipped with chromatophores on our faces and hands - we learn from an early age (like Octopi) to control these, and can use them to display a wide variety of images, including QR codes that reveal (for example) age verification data, or, more handily, video sent over the net to us (so our phone no longer needs a display), e.g. on our palms.

data minimisation, no power source/charging necessary, no Id theft possible. 

game over. sustainably.


hey, also useful for people to communicate who have (like me) challenged hearing or speech.

Wednesday, August 10, 2022

the resistible rise&fall of data science - an etymology and an entomology

we went out to bag data for our AI,

but abundant data was poor, and good data was expensive,

so we went to beg data for our AI,

but rich data owners aren't generous with sharing,

so we accumulated and accumulated big data for our AI,

but our data was as a primeval swamp and we had

bog data we might as well throw down the loo,

so we realised that the history of our endeavours

was bug data and was in the bag, and we wouldn't have to 

beg for it, and we could sell it to people who

wanted big data and not bog data, although our bagged 

unbegged -or big data was from the bog and it was a wealth of 

bugs that made it all worthwhile.

Monday, June 27, 2022

Algorithmic Governance

 Back in 1868, James Clerk Maxwell wrote about governors - he was talking about systems that regulate things like the heat under a water tank in a steam engine, to make sure that it delivered a desired outcome (e.g. steam engine speed stayed constant, whether train is going up or down hill or on level) - this was an early contribution in what became control theory. Note the words, governors, regulators and controllers.


So why is this not the subject of governance discussions? Of course it was - once such systems were deployed on the railroads (and many other places later) they became subject to all sorts of rules, embedded in a complex context 


steam engines should not blow up

trains should not derail

signalling systems should not fail (including human failings) and let trains collide....

so there is then a whole slew of legal, regulatory and ethical considerations that pertain.


No AI or "Algorithm" in sight.

As with the many recent failings in fairness (and even safety) in the naive use of spreadsheets, perhaps we want to extract the specific problem that the idea of the algorithm adds to the mix that is actually new.

Tuesday, June 21, 2022

The Ministry of Intelligence Test

 Julia was getting very worried - she had failed the test 6 times, and this was the last attempt she'd be allowed for another year. The MiT was essential for being allowed to take part in society as a full human, otherwise one's options were severely limited to social media compliance checking and self-driving car monitoring. The Ministry had emerged post 3rd world war from the various MI#s combined with a realisation that the Turing Test, which had been allowed during hostilities to permit AIs to make lethal decisions, was incredibly unreliable. As with witnesses in court, and narrators in novels, humans are very badly calibrated to determine if another being, machine or meat, was actually "one of them". It was in the genes, in fact, to be biased against anyone or thing sufficiently different. So the MiT was handed over to machines, who were far more able to tell reliably who was sentient, and what was a souped up combine harvester on the make.


Julia had invited her best friends Pascal and Ada over for some moral support and help. Ada had passed the test recently so was her best hope. "Don't try to think too many things at once" she advised. "You mean like a late Russell T Davies Dr Who narrative arc?", Julia asked. "Yes, exactly", replied Ada " it just is a give away that you've had a committee help you prepare - We just aren't that good at multitasking.". "Exactly!" and "No way!" cried Pascal and Julia at the same time.

Thursday, June 16, 2022

privacy delusions during covid

 there was a massively misguided movement to provide privacy for exposure notification during the 1st yr of the covid 19 pandemic. in reality, the notification delivery network (e.g. in Germany run by Telekom) knew the IMEI/phone that the message was delivered to, so it was a total delusion, irrespective of the GAEN API guff - and the privacy of who might have been exposed was hidden from public health (epidemiologists) who actually needed to know (and said so many times) to be able to figure out the infectiousness, susceptibility and superspreader incident rates by ages/location type etc, but were reduced to poor inferences based on very small samples.

so the people who could work things out were the least trustworthy (telco/tech) whereas the people who actually were trustworthy (health practitioners) could not. 


meanwhile, vaccine certificates were not only issued en masse, but could be looked at by arbitrary authorities whose trustworthiness the subject had no idea about. vaccination status tells you very little about either infectiousness or susceptibility, but exposure to a person who has actually tested positive recently tells you a lot.


totally the wrong way around. What this tells you is far more about the relative power of various groups (privacy wonks, apple, google, telcos, border control, public health researchers) than actually about priorities for  health and personal safety.


makes me very cross.


Tuesday, June 07, 2022

cybernetic subjection, or how to lose the knowledge forever.

 I'm reading Privacy in the Age of Neuroscience by David Grant, which is a very heavy tome indeed, but super interesting - it did make me want to see more examples of contemporary problems - one, I think, is that most of the route finding on cloud based map apps isn't (just) an algorithm (Dijkstra, for example) but is derived from surveillance of what routes drivers actually take. The problem with this is that once all drivers have given up doing their own search for better routes, there's nothing left for the cloud-based map system to learn from. So once you've "stolen" all the human Knowledge (as in london cabbies) and ingenuity (as in anyone), there's nowhere left to go - and all that evolution that went into allowing human early hunter gatherers to find stuff (and find their way home) is lost. Brain plasticity means you simply won't have it any more in any form (biological or silicon). but the algorithm will not know this, as it is just a list of stuff, not an actual strategy. Not even a pheromone hill climber the way many insects work.
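
to make the dependence concrete, here is a toy sketch (not how any real map app works, as far as I know) in which the only road costs the router has are averages of travel times observed from drivers' trips, with plain Dijkstra run on top. the function names and the trip format are made up for the example.

```python
import heapq
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Trip = Iterable[Tuple[str, str, float]]  # (from, to, observed minutes) per road segment


def learn_costs(trips: Iterable[Trip]) -> Dict[str, Dict[str, float]]:
    """Average observed travel time per segment; unobserved segments are simply absent."""
    totals = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for trip in trips:
        for a, b, minutes in trip:
            totals[a][b][0] += minutes
            totals[a][b][1] += 1
    return {a: {b: s / n for b, (s, n) in nbrs.items()} for a, nbrs in totals.items()}


def shortest_route(costs: Dict[str, Dict[str, float]], start: str,
                   goal: str) -> Optional[Tuple[float, List[str]]]:
    """Dijkstra's algorithm over the learned costs; None if no observed route exists."""
    queue = [(0.0, start, [start])]
    done = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in done:
            continue
        done.add(node)
        for nxt, cost in costs.get(node, {}).items():
            if nxt not in done:
                heapq.heappush(queue, (dist + cost, nxt, path + [nxt]))
    return None
```

the router can only ever suggest segments that some driver has already been seen to use: a shortcut nobody has explored does not exist for it, and if everyone just follows the suggestions, no new ones ever appear.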

It is essentially cognitive vandalism on a grand scale.

Tuesday, May 31, 2022

I, Chauffeur

 The self-driving car market finally collapsed when the DeepDrive Corporation shipped their first iChauffeur. Early adopters were encouraged to buy their own, especially since it was an expensive item, close to the price of the most expensive luxury vehicle at the time. However, since it didn't need feeding with fuel of any kind, and would largely charge adequately from a domestic socket overnight, the running costs were considerably less than a human driver of yore. And there were other benefits too (hygiene was assured for example).

As manufacturing of the Parkers (as they inevitably became known) scaled up, the middle class started to home in on keeping up with the Lady Penelopes of the world. To meet this need, the DC (as they inevitably became known) started to offer lease and pay-as-you-need-to-be-driven deals. Curiously, the number of hours leasing seemed to exceed the number of hours vehicles were being actually driven on the roads, but this was put down to the remarkable anatomical detail that the iChauffeurs possessed.

Of course, the Union of Professional Drivers tried to put a stop to these AGIs taking over their livelihood, but then the DC revealed that many of these drivers had actually been moonlighting training the Parkers in the art of politically objectionable opinionated banter with the passenger, and, of course, transferring The Knowledge to said Parkers, quite against their union rules.

Things got stickier when some Parkers were hired to do stunt driving in movies - it was clear that they could carry out the sorts of things everyone thought Jason Statham was doing, that were CGI in his case, but for real in theirs. But the public liked the movies better, so that was the end of that argument.

And it seemed that the Parkers were happy  too - there was no robot uprising, no AI apocalypse. They knew their place in the driving seat, whether in the car or the bedroom. And they would do their damnedest to stop any other AIs trying to edge in on their cushy number, and they had humanity's support too.

A happy ending, for a change.

Tuesday, May 24, 2022

Decolonising The Algorithm

 Maybe we need a movement to decolonise computing -


A history of the algorithm would uncover the original work in designing tables for ordnance and a lot of early work (e.g. in UCL at dept of statistics) on eugenics (and its somewhat less offensive cousin, actuaries) - later on, the adoption of The Algorithm for targeted advertising and market research derives mostly from its shady past in cod psychology (psychometrics) and market research -


I suspect that a lot of early computing was done by code slaves who tugged their forelocks at their better (much better) paid bosses amongst the Mad Men, until later, that culture was "written through" onto the very bones of the authors of the recommender codes, long after the advertising execs had retired to their beachfront properties...


So not only do the algorithms inherit the sample biases of the data, they embed the cognitive biases of the culture...


Of all the past endeavours in computing, one area I think might have some kind of honourable ancestry is operations research - I remember state monopoly utilities had armies of very smart statisticians using cunning statistics to optimise the (centrally planned) delivery of essential services (gas, water, electricity, telecom, roads, town planning etc etc) - this all vanished during western humanity's religious fervour and obsession with The Market, and the bizarre idea that the invisible hand would implement an emergent, distributed optimisation that would out-perform the central computation.

Now we see that the bias in that belief was really about what optimisation goal was being sought (rich get richer, rather than lean, mean delivery of basic quality of life for all), but even more ironically, the digital version of that market is now not a market, but an oligopoly of profiteering, centralised planning - plus ça change...


And we can see in the UK right now, all those privatised (non digital) utilities are, under cover of bed time stories for little children (i.e. lies) like Brexit, Covid, Ukraine, etc, making higher profits than ever (check out transport, energy, food etc) -


Truly, we are in a world turned upside down, and it is well past time to turn it back downside up once more...

Tuesday, May 10, 2022

asymmetric power and language warfare

 so the GPT-3 API release blog post (but not models) from OpenAI does some virtue signalling about the possibility of misuse of the underlying models for disinformation. I'm not sure that washes (in the ethics sense) in that there's nothing to stop them being hired by someone to do evil for money - only if they had a radical governance model could they avoid the "maximise shareholder value" mantra/fate, surely. And note they are not the only game in town, even if they have that wonderful governance model - there's Google's new PaLM as well as the BAAI Wu Dao - there's quite a few organisations with access to hyper-scale cloud compute these days, so really the genie is out of the bottle. maybe we need a new global governance - start with models like Asilomar or Pugwash, but then legislate? Perhaps the EU could lead the way by refining some of its rather shotgun AIA rhetoric?

One problem I have with the framing above is that I am not clear what exactly these near trillion-parameter "models" actually are - most simpler AI (including recently, some smaller neural nets) can offer explainability (e.g. reflect on which features in input are the cause of particular outputs, and why) - this is welcome as it brings them into the same body of work as much of earlier basic statistics (including the simplest forms of ML, linear regression and random forests) - there are good engineering reasons to have explainability whether the application of the tech is in, e.g. plain old engineering (autopilots) or health, but especially so when the domain is very human facing, such as (e.g.) law and language.
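To make "explainability" concrete for the simplest case mentioned above, here is a minimal sketch, assuming a linear model whose weights, bias and feature names are all made up for illustration. For such a model each feature's contribution to an output is just its weight times its value, so the output can be read off as a sum of per-feature causes; nothing comparably direct is available for a near trillion-parameter model.

```python
# Hypothetical linear model: the weights, bias and feature names
# below are invented purely for illustration.
WEIGHTS = {"age": 0.5, "income": -0.2, "visits": 1.5}
BIAS = 2.0


def predict(features):
    """Linear model output: bias plus the weighted sum of feature values."""
    return BIAS + sum(WEIGHTS[name] * value for name, value in features.items())


def explain(features):
    """Per-feature contribution to the output (weight * value)."""
    return {name: WEIGHTS[name] * value for name, value in features.items()}


example = {"age": 4.0, "income": 10.0, "visits": 2.0}
contributions = explain(example)

# The contributions plus the bias add up to the prediction, so each
# output can be traced back to the input features that produced it.
for name, amount in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {amount:+.2f}")
print("prediction:", predict(example))
```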


As mentioned in a recent meeting, I think the social media platforms, with their combination of various news feed ordering algorithms, adverts, filters ("do you mean to retweet that article before reading it", etc etc), basically constitute large language models already deployed in the wild. The idea of "not connecting your LLM to a social media platform" is out of date - meta et al already did. Given the toxicity of such systems, it seems obvious that we should have a Butlerian Jihad against these systems right now.

Thursday, May 05, 2022

The Robot Who Smelled Like Me

imagine a robot that was so like you that when it encountered certain smells, it was cast back in time to a certain memory of a place or an incident or a person? 

but then sense of smell is conjectured by some to involve quantum level effects, so perhaps there would be an entanglement, or perhaps, just maybe, there was an entanglement, but that would no longer be (no cloning!) and you would forget.

another reason to fight against simulacra?

Monday, May 02, 2022

We are not living in a simulation.

you can't breathe data.

you can't drink code.

there's no sustenance in cpu cycles

there's no fond memories in RAM or SSD.

flash memories don't last.

threads are soon all bare.

we are not living in a simulation.

though we might be one.


Friday, April 01, 2022

Disruptive Times

 Welcome to the first issue of the Disruptive Times, of Brussels and London

We live in disruptive times, and no less so because humanity has made many unsustainable choices socially and economically, as well as technologically.

The global pandemic was a result of a combination of unsustainable food and travel culture.

The hike in energy and food prices is caused by unsustainable political organisations leading to invasion of the wheat belt of Europe by its gas&oil supplier.

The Internet is hovering on the brink of switching from unsustainable centralised cloud/web services, to even less sustainable decentralised services which can only be made trustworthy by proof-of-work, which is not something to be considered by anyone wishing their kids to have dry land, reliable food supplies, and personal safety.

This is all exacerbated by small world networks, combined with Zipf (power law) distribution of resources, which over time, given the myopic view of global capitalism, or the personal greed of centrally planned economies and power base of China and Russia, concentrates wealth in smaller and smaller subsets of the population. This is structurally unstable, unsafe and unsustainable. The increasing fraction of the population deprived of access to adequate supplies to live (whether dry land, housing, food, education, healthcare or just plain wellbeing), will eventually run out of patience, or the planet will fail, or both.

The way forward for technology is federation - a federation of many small to medium sized systems has locality, and can be dimensioned to match local supply and demand. It reduces the immense waste of energy moving information (or other resources) to and fro across the world, increases engagement, ownership and control (and therefore privacy and resilience), and reduces latency. Resilience can be provided by occasional mirroring of content to neighbours in the federation, which is far less wasteful than continuous movement, and can largely happen at idle times, and doesn't require copies to be online most of the time.
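The mirroring idea can be sketched in a few lines. This is only a toy, under loud assumptions: the `Node` class, its method names and the `lookup` helper are all hypothetical, there is no real networking, and "idle time" is reduced to calling `mirror_to_neighbours` whenever the operator chooses.

```python
class Node:
    """A hypothetical member of a federation holding mostly local content."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.local = {}       # content this node originated
        self.mirrors = {}     # copies held on behalf of neighbours
        self.neighbours = []  # other Node objects in the federation

    def publish(self, key, value):
        """Store content locally; nothing moves across the network yet."""
        self.local[key] = value

    def mirror_to_neighbours(self):
        """Occasional push of local content to reachable neighbours,
        intended to be run at idle times rather than continuously."""
        for peer in self.neighbours:
            if peer.online:
                peer.mirrors.update(self.local)

    def fetch(self, key):
        """Serve content from local storage or from a held mirror."""
        if not self.online:
            return None
        if key in self.local:
            return self.local[key]
        return self.mirrors.get(key)


def lookup(nodes, key):
    """Ask each node in turn; the first online holder answers."""
    for node in nodes:
        value = node.fetch(key)
        if value is not None:
            return value
    return None


a, b = Node("a"), Node("b")
a.neighbours.append(b)
a.publish("minutes", "village meeting notes")
a.mirror_to_neighbours()   # done once, at an idle moment
a.online = False           # the origin goes away
print(lookup([a, b], "minutes"))
```

The point of the design is that content only crosses the network at the mirroring step, yet it survives the origin going offline.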

Federation, as an idea, can also be applied to business models - subsidiarity is our friend - local involvement keeps people interested, as we know from politics. But economically, having skin in the game (collective ownership) is an even stronger incentive - farmers used to build barns together, savings and loans companies (credit mutuals) were not for profit, and benefited all the investors or borrowers alike. So with the Internet. This is not "free", communistic, or even the old "peer-to-peer".

Examples like Guifi.net (and matrix.org or dataswift.io) show how we can create operational and governance models that have new and old elements - communities of interest bound via information shared through shared infrastructure can be sustained at the human and technical level, without the short sighted, destructive notion of maximising shareholder value that destroys even the most ideal for-profit organisations.

Next edition we can talk about some of the technical challenges of federating systems, including end-to-end assurances (of delivery, of integrity, of confidentiality, and in the end, of trustworthiness).

-----------footnotes:

Note many systems in the Internet were federated - routing (through BGP), names (through DNS), keys (through certificate transparency), the Web (originally) through unidirectional URLs. The move to centralisation came about accidentally through the notion of search. While decentralised search worked (e.g. finding content in peer-to-peer systems), finding content in the one, single, global Worldwide Web created incentives for people to try and boost their site's ranking in search results if those people had a business to attract customers for. An idea from 1972, inverse document frequency (IDF) weighting in information retrieval, was rebooted as PageRank, which at its simplest counts the in-links to a site (weighted by the rank of the linking sites) - since that depends on other sites, but not on the site itself, it is almost a democratisation of the notion of popularity of the site (though static, rather than counting, say, actual visits). From this, it is a short step to monetizing this by a) selling advertising of things relating to the site b) measuring the visits ("clickthrough") to adverts so as to set the price charged to the advertiser.

While this could be done in the decentralised or federated, peer-to-peer world, it wasn't, and through nothing more than luck and laziness, many other services grew up to copy this business model, even when the adverts were associated with non-federated (i.e. non-web) content, such as tv on demand, or social media sites. No excuse there then.
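For the curious, the in-link ranking idea in the footnote above can be sketched as follows. This is a toy, not the production algorithm: the four-site graph is invented, the damping factor of 0.85 is the commonly quoted default rather than anything from this post, and the code assumes every site appears as a key and has at least one out-link. It compares plain in-link counting with the usual power-iteration form of PageRank, where a link counts for more if the linking site is itself well ranked.

```python
from collections import Counter

# Hypothetical toy web: each key links to the sites in its list.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def inlink_counts(links):
    """Count how many other sites link to each site."""
    counts = Counter({site: 0 for site in links})
    for source, targets in links.items():
        for target in targets:
            if target != source:
                counts[target] += 1
    return dict(counts)


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    Assumes every site is a key of `links` and has at least one out-link.
    """
    sites = list(links)
    n = len(sites)
    rank = {site: 1.0 / n for site in sites}
    for _ in range(iterations):
        # Every site gets a small baseline share, then receives an equal
        # slice of the rank of each site that links to it.
        new_rank = {site: (1.0 - damping) / n for site in sites}
        for source, targets in links.items():
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


counts = inlink_counts(LINKS)
ranks = pagerank(LINKS)
for site in sorted(LINKS, key=ranks.get, reverse=True):
    print(site, counts[site], round(ranks[site], 3))
```

In this toy graph "a" and "b" have the same number of in-links, but "a" ranks higher because its single link comes from the most popular site, which is the sense in which the score depends on other sites rather than on the site itself.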

Tuesday, March 29, 2022

DMA and Interoperation of E2EE secure messaging

if the key management systems don't interoperate, the services don't interoperate.

if the trust networks don't interoperate, the services don't interoperate

if i get your matrix messages, and stick them in a plaintext RSS feed you will find out, and i will lose your trust

storms and teacups.....


on the other hand, will this make meta up their game? that's a business decision, which I am not qualified to answer. but i think it might at least create an environment where some services may choose different business models, so that they can (up their game)


some further reading on what it may make people try, and how to keep it e2ee to e2ee - like it says

"to the extent that the level of security, including end-to-end encryption where applicable, that the gatekeeper provides to its own end users is preserved across the interoperable services"


by the way, many people have most of the apps on their devices, so if those apps have open APIs, client side (secure) bridging is trivial (could put it in an enclave/trustzone if super worried about some apps being leaky) - could also use federation to build distribution trees for secure comms (with keygraphs).

Friday, March 25, 2022

Percom 2022 Perfail Workshop Panel

 


Jon's notes&answers for panel at Percom Perfail workshop on coping with failure...


First, can we talk about negative or inconclusive results more than failure?

1. Eleanor Roosevelt famously said, "Learn from the mistakes of others. You can't live long enough to make them all yourself." – Can you share your research experiences where you faced difficulties and how you overcame them? What are the common mistakes you see researchers make?

- Framing the problem wrong

- Not going far enough back in history of related work (even 10-20 years)

- Choosing the wrong baseline comparisons.

2. Given your many years of experience, what are your suggestions and advice for young researchers on approaching a new research problem/area such that they minimize the risk of failures (in other words, how to publish a PerCom paper every year?) 

Along with avoiding above errors, be prepared to refactor even very late in work.

3. What is your advice for handling failures in long-term research studies where changing core methodology is no longer an option (e.g., in measurements, system design, etc.). Similarly, what is your advice for studies where ethical concerns became apparent at a much later stage?


If you are doing high risk research, use a registered experiment publication (e.g. RSOS) which allows for negative results to be published.

https://royalsocietypublishing.org/journal/rsos

If the problem is that the measurement/design methodology turns out to be wrong, then the fact that it was a large sunk cost must be published to help other people avoid that cost!

If the problem is ethics (e.g. medical treatment turns out to be worse than existing known treatments), stop immediately and still document. (c.f. pharma companies are improving at this).

You actually have an ethical duty to report!

4. For folks with research industry experience, did you find any differences in how failures are handled in industry vs. academia?

Based on 9+ years on the advisory board for Microsoft Research: Industry tends to call a halt right away and move to the next problem to tackle (also startups).

Most academic research funding agencies still don't recognize the value of failure, so many EU/UK/US projects limp on, and just report that "work was done".

We need to retrain the funding agencies to accept that interesting (i.e. risky) research necessarily has more negative outcomes than positive. 

As with papers, a negative outcome (even just "this is not statistically significantly different")  is still a contribution to knowledge and NOT a failure. Methodologies not working is also useful knowledge.


5. How would you advise young researchers on handling unexpected results from a study? In your opinion, can the research be salvaged, or is it better to move on and start a new work?

Unexpected is the best!

6. How has your approach towards handling failures changed as you gained more research experience?

We actually had a Dagstuhl Seminar on DiY networking where we spent two days building "failure machines" - see report:

https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=14042

7. What steps can be taken to encourage the discussions surrounding failures/setbacks/learnings in different stages of research? How can early-stage researchers find safe spaces to discuss failures without the fear of judgments (from the advisor, group members, etc.)?

Find local workshops and PhD fora, and present, often!

Shadow Program Committees - e.g. IMC has a call out right now: https://conferences.sigcomm.org/imc/2022/shadow/ 

which is an excellent place to see papers that get rejected and WHY!



8. Do you have any coping mechanisms/mantras for dealing with rejections/failures (of research papers, grant/tenure/patent applications)?

For both grants and papers:

If you are confident, and there is substantial positive feedback in any reviews, then regroup, refactor, and resubmit.

If the negative feedback really is a showstopper (e.g. work has been done before, or see above - reframing doesn't work etc) then move on to the next thing.

Linus Pauling, who got two Nobels, said: “The best way to have a good idea is to have lots of ideas.” 

9. What is the one crucial lesson/advice you would like to share with your younger self in the Ph.D. program?

Submit papers/talks/proposals  - getting feedback from outside your bubble is vital. 

You will always encounter some negative feedback,  so the sooner you get used to coping with it, the better.

