Tuesday, October 24, 2023

In-network processing - do we ever really need it?

We've looked at this problem from several sides now - to solve "incast", to do aggregation for map/reduce or any federated learning platform, to aggregate acknowledgements for PGM.

When we say "in-network", we're talking about in-switch processing - borrowing resources from the poor P4 switch to store and process multiple application-layer packets' worth of stuff, so that only one actual packet (or at least a lot fewer) needs to be sent on its way.
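To make "many packets in, one packet out" concrete, here is a minimal sketch of the aggregation step itself, independent of where it runs. The function name `aggregate` and the choice of an element-wise sum are assumptions for illustration (think gradient vectors in federated learning); the same shape would apply to counting acknowledgements.

```python
from typing import Iterable, List


def aggregate(packets: Iterable[List[float]]) -> List[float]:
    """Element-wise sum of equal-length payloads: many packets in, one out.

    Hypothetical helper for illustration only, not a P4 program.
    """
    total = None
    for payload in packets:
        if total is None:
            total = list(payload)  # copy so the caller's data is untouched
        elif len(payload) != len(total):
            raise ValueError("payloads must all have the same length")
        else:
            total = [a + b for a, b in zip(total, payload)]
    if total is None:
        raise ValueError("no packets to aggregate")
    return total


# Three workers each send one payload; a single combined payload goes onward.
workers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
combined = aggregate(workers)
print(len(workers), "packets in, 1 packet out:", combined)
```

Note the state this needs: the running total has to sit somewhere until the last contribution arrives, and that is exactly the memory being borrowed from the switch.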


So how about we compare with multicast (in-network copying) and its replacement (largely) by CDNs/overlays?

Key point is the branch points in the net - this is where the "implosion" (for incast) or "explosion" (for multicast) happens.

So do we have a server nearby? Or can we just put one there (or just connect one there)?


Answer is (for multicast) yes:

Netflix/PoPs in the wide area - use a distribution tree to all PoPs, and caches

So in the data center:

use servers, not switches, and build a forest of sink trees

Clos system: connect servers to the local switch, top of rack, and spine switch/server... then for servers at some level, use a node at the next level up as the aggregation server (note Clos even has redundancy, so this will survive edge/switch outages)
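A rough sketch of one such sink tree, assuming a made-up two-rack layout with one aggregation server per rack and one at spine level (all node names and values here are hypothetical). It sums the leaf values up the tree and records how many packets each aggregation server has to receive:

```python
from typing import Dict, List


def reduce_up(tree: Dict[str, List[str]], values: Dict[str, int],
              node: str, received: Dict[str, int]) -> int:
    """Return the aggregate for the subtree rooted at `node`.

    `received` is filled in with the number of packets each aggregation
    server gets (one per child), i.e. its fan-in.
    """
    children = tree.get(node, [])
    if not children:
        return values[node]  # leaf server: just its own contribution
    received[node] = len(children)
    return sum(reduce_up(tree, values, child, received) for child in children)


# Hypothetical layout: aggregation servers hang off the ToR and spine switches.
tree = {
    "spine-agg": ["rack0-agg", "rack1-agg"],
    "rack0-agg": ["s0", "s1", "s2"],
    "rack1-agg": ["s3", "s4", "s5"],
}
values = {f"s{i}": i + 1 for i in range(6)}  # s0..s5 contribute 1..6

received: Dict[str, int] = {}
total = reduce_up(tree, values, "spine-agg", received)
print("total:", total)
print("packets received per aggregation server:", received)
```

With six senders going straight to one sink, the sink takes six packets at once; here no node receives more than three, and the final sink receives two. The redundancy in Clos is not modelled: a real version would pick an alternative aggregation server at the same level when one is unreachable.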
