Thursday, March 30, 2023

databox 2.0 - the manifesto

Databox and HAT were two personal data store projects (sometimes called "edge cloud", sometimes "self-sovereign data") that tried to solve the practical and business challenges perceived nearly a decade ago, challenges also recognised by the web community in the Solid architecture.

While all these initiatives were somewhat successful on their own terms, what they have not managed to do is create a platform that others (e.g. Mastodon or Matrix) could "just" use.

Why not? What's missing?

Manageability. Compared with the (centralised?) cloud, edge systems lack some properties that are really non-optional for storage and compute services today (30th March 2023)...


The list below is a start on what data centres and cloud computing achieved, but where the edge has not really delivered a platform. Centralised systems provide these properties using techniques that could be re-applied at the edge, but one of the key missing pieces is what makes cloud "scale out": amortising the cost of providing these mechanisms over many, many users. In the edge case you have precisely the opposite - the cost of participation increases as you add users - as opposed to (sorry :-) exploiting Metcalfe's law of network scaling, i.e. that the value of the network increases super-linearly with the addition of each "user", since each user can also offer a "service" - super-additive value, rather than just an added burden on the overall community.

Let's be specific - this isn't about content, it's about compute and storage demand. Of course, a decentralised or federated system could still exploit users' content, although that is contra-indicated by most people's motives for using such systems. But the fabric has to provide:

  • availability - access networks and edge devices fail; so do core and metro links, and system units and racks in data centres, but much less frequently. Masking failures typically means adding redundancy - e.g. multihoming systems and running replicas. Replicas need to be synchronised, typically via some consensus scheme (or a CRDT). In the decentralised case this is harder, because failures are more complex and the number of replicas may need to be higher (I have seen estimates that you need about 6-fold replication to be as good as a cloud provider, who might use at most 3-fold - a back-of-envelope sketch below illustrates the arithmetic).
  • trustworthiness - a cloud provider makes a deal with you, and can be trusted or else goes out of business. Your peer group in an edge cloud may well include bad actors, so now you need Byzantine fault tolerance, which is a lot harder than simple majority consensus.
  • persistence - cloud providers have mostly shown that they are not "fly by night" (or the ones that were have gone bust). Personal devices come and go - I think I've seen estimates that people change smartphone about every 2-3 years (many contracts are designed around that). While newer devices have more storage, they typically arrive with only very simple cloud-based sync from the old device (though Android and iOS can do device-to-device sync, of course). Users also have tablets, laptops and workplace systems that all need syncing, and they expect long-term survival of their content (as well as any ongoing processing) - personal content like email/messaging, photos/videos, even music, films, games etc. Cloud-based systems have amortised the costs of backups/replicas, and indeed of deploying new storage technology, at a wholly different scale from the edge. New technologies (synthetic DNA, silica/glass storage) are already being looked at for the next steps in that context, but are years away from being affordable/usable for edge devices (if ever, given their slow write times... do edge users have the patience, or personal persistence, to take care of that?).
  • capacity - while newer devices have impressive storage, that just makes things worse, since one moves from taking photos on the phone to HD video, and on to augmented reality, and so on.
There are no doubt other things one could add (complexity of key management not least, although recent thinking in that area might provide some solutions), but for me this needs a concerted effort to re-think the job. If I depend only on other users' edge devices, how do I find the five other users, and what is in it for them, to replicate my services so that I can continue to reach them wherever and whenever I need? How do we deal with continuous upgrades over vast numbers of disparate end users' systems, rather than vast numbers of systems in a small number of administrations? What are the ways one can seek redress in such a world for data loss or breach of confidentiality? And on and on...
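
As a back-of-envelope illustration of the availability and trustworthiness points above, here is a small Python sketch (my own, using assumed per-device availability figures and an arbitrary "four nines" target, not measurements from any real deployment) of how the replica count grows when the underlying devices are flaky, plus the standard 3f+1 bound that Byzantine fault tolerance imposes:

    # Smallest k such that 1 - (1 - p)^k >= target, i.e. how many
    # independently failing replicas are needed to hit a target availability.
    def replicas_needed(device_availability, target):
        k, p_all_down = 0, 1.0
        while 1.0 - p_all_down < target:
            p_all_down *= (1.0 - device_availability)
            k += 1
        return k

    TARGET = 0.9999  # "four nines" of read availability, picked for illustration

    # A data-centre node up ~99.9% of the time vs. a phone or laptop reachable
    # maybe ~80% of the time (asleep, offline, flat battery...). Both figures
    # are assumptions made purely for the sake of the arithmetic.
    for label, p in [("cloud node", 0.999), ("edge device", 0.80)]:
        print(label, "needs", replicas_needed(p, TARGET), "replicas for", TARGET)

    # Byzantine fault tolerance is stricter still: surviving f arbitrarily
    # misbehaving replicas needs at least 3f + 1 participants, so tolerating
    # two bad actors in a peer group already means seven copies.
    f = 2
    print("BFT with f =", f, "needs at least", 3 * f + 1, "participants")

With those (assumed) numbers the edge needs roughly three times the replicas of the cloud just to mask crash failures, before any byzantine behaviour is even considered.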

So one promising approach is what has happened with Mastodon, which is only "somewhat decentralised", in that each instance potentially serves quite a few people and, indeed, may be running not on a Raspberry Pi in someone's attic but on a cloud infrastructure service. The key point is who controls access to applications and content, not where the bare-metal compute is located. At least for now. Indeed, even if one distributed the computers, one is still largely dependent on centralised electricity generation (ok, so I have solar, but how many people do...?).

So decentralised keys, plus somewhat-centralised but migratable services, looks pretty ok to me... to be getting on with.


There's a temptation to claim that a centralised system can emulate the federated/decentralised properties concerning control/ownership/access by encrypting data in storage, transmission and processing (enclaves, homomorphic encryption or secure multiparty computation), hence partitioning the data by key/function management rather than by spatial/device ownership. While this might work in terms of data use, it doesn't deal with several centralised-system failure modes, including (a non-exhaustive list): denial of service (deletion, whether intentional or through central organisation failure/business exit, change of T&Cs, etc.); and the small gene pool of software, hence a likely vulnerability that strikes all customers of the central service at once (even if a data breach doesn't follow, it might), whereas such an attack on a large heterogeneous system might be far less likely to impact as many users. However, on the other side of that coin, if I want decentralised systems whose replication is itself heterogeneous, I need the copies to run on other people's systems, and those people might not be any more trustworthy than the cloud/central providers... so one might still want enclaves and homomorphic crypto on the social-net friends' replica systems, but that crypto then eats into some of the energy savings of being at the "edge". Indeed, edge-to-edge communication might be very much more complex than edge<->centre, which the network has been optimised for. Replica consistency in the edge case is a relatively new thing, and consensus protocols haven't been stress-tested anywhere near as much as necessary to say they'd be as good as the simpler, purpose-designed cloud/data-centre replication environments' use of algorithms like Raft etc...
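
To make the "partition by key/function rather than by device" idea a little more concrete, here is a toy Python sketch of additive secret sharing, the simplest ingredient of secure multiparty computation. It is purely illustrative - my own example, not something Databox, HAT or Solid actually ship - but it shows the shape of the trick: each holder keeps only a random-looking share, aggregates can be computed share-wise, and only recombining the shares reveals a value.

    import secrets

    PRIME = 2**61 - 1  # arbitrary large prime modulus for the share arithmetic

    def share(value, n):
        # Split value into n additive shares modulo PRIME; any n-1 of them
        # together look uniformly random and reveal nothing about the value.
        parts = [secrets.randbelow(PRIME) for _ in range(n - 1)]
        parts.append((value - sum(parts)) % PRIME)
        return parts

    def reconstruct(parts):
        return sum(parts) % PRIME

    # Two users' step counts, each split across three non-colluding holders.
    a_shares = share(8_000, 3)
    b_shares = share(12_500, 3)

    # Each holder adds its own pair of shares locally, never seeing either total.
    sum_shares = [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]
    assert reconstruct(sum_shares) == 20_500
    print("aggregate recovered without any holder seeing the inputs:", reconstruct(sum_shares))

The point of the sketch is only that, in principle, confidentiality can be separated from whose machine the bytes sit on - which is exactly why it does not answer the availability, persistence and redress questions above.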

So it is not quite so simple to compare decentralised high availability systems with centralised high confidentiality systems.

