a lot of distributed systems folks get a bee in their bonnet about eventual consistency, and devise ever more baroque solutions to the problem of Byzantine fault tolerance (and some lovely work, don't get me wrong - the Byzantine-Altruistic-Rational (BAR) tolerance work is an awesome synthesis of so many cool ideas, and is useful)...
however, a lot of systems are never actually consistent, yet are still useful. Two examples to illustrate:
1. the world wide web - people berated Berners-Lee when he devised uni-directional URLs, arguing they would lead to inconsistency - dangling references abound, but the web (including the spiders & robots that index it for search engines or LLMs) still works. You just crawl what is there and leave out what is not currently reachable... you can devise all sorts of heuristics, based on the statistics of what you find over time, to decide when to revisit (a toy sketch of this appears below)...
2. routing - vector-based algorithms (RIP, BGP) are perhaps eventually consistent, but the convergence time often exceeds the mean time between failure and repair, so the system of routes never actually converges. Vectors stored at different nodes in the net point in directions that are no longer "globally" consistent - however, packets forwarded along those paths may have their direction changed as they encounter more up-to-date routers and get their progress "put to rights". Of course, one problem this can lead to is transient loops - these are eliminated in many protocols (BGP's AS-path check is one trick; there are plenty of others). So then how "bad" are the temporarily divergent routes, and what do we save in routing protocol and computation overheads by not worrying about this, if many packets still get to the destination, albeit via the scenic route rather than the "best" by some metric? (The second sketch below tries to put a rough number on exactly that.)
We don't really know. But we should, because then we could actually design algorithms that work in the presence of inconsistency and still have some sort of predictable performance across a range of topologies and failure/repair statistics.
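To make example 1 concrete, here is a toy sketch (a fake in-memory "web" with invented page names standing in for real HTTP fetches, not a real crawler): it indexes whatever is reachable on each pass, shrugs off dangling links, and notes which pages changed since the last pass as a crude signal for when to revisit.

```python
# Toy "web": page -> outgoing links. Some links dangle (they point at
# pages that do not exist); the crawler just skips them.
WEB = {
    "a": ["b", "c", "dead1"],
    "b": ["a", "dead2"],
    "c": ["b"],
}

def fetch(url):
    """Stand-in for an HTTP GET: returns the page's outlinks, or None if it's gone."""
    return WEB.get(url)

def crawl(seed, last_seen=None):
    """One crawl pass: index whatever is reachable right now, ignore the rest.

    last_seen maps url -> outlinks from a previous pass; comparing against it
    gives a crude 'has this changed?' signal for deciding when to revisit.
    """
    last_seen = last_seen or {}
    frontier, seen, index, changed = [seed], set(), {}, set()
    while frontier:
        url = frontier.pop()
        if url in seen:
            continue
        seen.add(url)
        links = fetch(url)
        if links is None:           # dangling reference: note it and move on
            continue
        index[url] = links
        if last_seen.get(url) != links:
            changed.add(url)        # churny pages are worth revisiting sooner
        frontier.extend(links)
    return index, changed

index, churny = crawl("a")
print(sorted(index), sorted(churny))   # dead1/dead2 never make it into the index
```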
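And for example 2, a similarly toy sketch (a four-node network; the names, tables and failure scenario are invented for illustration) of what "divergent but still delivers" looks like: one node's forwarding entry is stale after a link failure, a packet forwarded hop by hop still arrives via the scenic route, and we can read off the stretch relative to the currently consistent shortest path.

```python
from collections import deque

# Toy topology just after a link failure. Before the failure, s reached d
# via a (s-a-d). The a-d link has gone down: a has already recomputed its
# route (via b), but s has not heard the news yet.
LINKS = {
    "s": ["a", "b"],
    "a": ["s", "b"],      # a-d has failed
    "b": ["s", "a", "d"],
    "d": ["b"],
}

NEXT_HOP = {              # each node's own next hop toward destination d
    "s": "a",             # STALE: s still believes the old s-a-d path exists
    "a": "b",             # fresh: a has rerouted via b
    "b": "d",             # fresh
}

def shortest_hops(src, dst):
    """BFS hop count on the *current* topology: the globally consistent baseline."""
    dist, q = {src: 0}, deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            return dist[u]
        for v in LINKS[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return None

def forward(src, dst, ttl=16):
    """Forward hop by hop using each node's own (possibly stale) table.

    Returns (delivered?, path). The TTL bounds transient loops, which arise
    when stale entries at different nodes point back at each other.
    """
    path, node = [src], src
    while node != dst and ttl > 0:
        node = NEXT_HOP[node]
        path.append(node)
        ttl -= 1
    return node == dst, path

ok, path = forward("s", "d")
best = shortest_hops("s", "d")
print(ok, path, "stretch =", (len(path) - 1) / best)
# -> True ['s', 'a', 'b', 'd'] stretch = 1.5  (scenic route, but it arrives)
```

Measured over realistic topologies and failure/repair statistics, that stretch (plus the frequency of TTL-expired loops) is exactly the sort of quantity we would want a predictable bound on.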
Sure, we can also devise algorithms that attempt to maintain global consistency (cf. link-state routing), but these may have scale limits. In general, I feel some sort of thinking about the trajectories of a diffusing computation's intermediate states, and the effective actions those states cause along the way, is warranted. There has been some work (I know one of Tim Griffin's students had some neat ideas in this space), but it might need a new approach, which might yield value elsewhere - this is not quite as clean as (say) CRDTs, which divide and conquer the consistency problem (a tiny example below gives the flavour), but some similar kind of thinking might apply.
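For comparison, here is the CRDT flavour of divide-and-conquer in its most minimal form - a state-based grow-only counter, sketched with made-up replica names - where every merge order converges to the same value, which is precisely the property the routing case lacks.

```python
# Minimal state-based grow-only counter (G-counter): each replica only bumps
# its own slot, and merge is element-wise max, so replicas converge no matter
# how, or in what order, updates propagate.

def fresh():
    return {}                       # replica_id -> count

def increment(state, replica, n=1):
    s = dict(state)
    s[replica] = s.get(replica, 0) + n
    return s

def merge(a, b):
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}

def value(state):
    return sum(state.values())

# Two replicas update concurrently, then exchange state in either order:
x = increment(fresh(), "x", 3)
y = increment(fresh(), "y", 5)
assert merge(x, y) == merge(y, x)   # commutative: both sides converge
print(value(merge(x, y)))           # -> 8
```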