Using Local Inference in Massively Distributed Systems
EU FET-Open FP7
As today’s networked techno-social systems grow in scale, analysing their global phenomena becomes increasingly difficult: streams of data are produced continuously, scattered among distributed, possibly resource-constrained nodes, and must be analysed reliably in (near) real time.
We will explore a novel approach for realising sophisticated, large-scale distributed data-stream analysis systems that relies on processing local data in situ. Our key insight is that, for a wide range of distributed data analysis tasks, we can employ novel geometric techniques to decompose the monitoring of complex holistic conditions and functions into safe, local constraints that can be tracked independently at each node (without communication), while guaranteeing correctness of the global monitoring operation. While some solutions exist for the limited case of linear functions of the data, general, non-linear functions are hard to deal with: in this case, a node’s local function value, taken on its own, tells us essentially nothing about the global function value. Our fundamental idea is to design novel algorithmic tools that monitor the input domain of the global function rather than its range. Each node can then be assigned a safe zone (SZ) for its local values, such that as long as every node stays within its safe zone, the value of the global function over the entire collection of nodes is guaranteed to satisfy the monitored condition.
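The safe-zone idea can be illustrated with a minimal sketch. It is not the project's algorithm; it assumes, purely for illustration, that the global function is the squared Euclidean norm of the average of the nodes' local vectors, that the monitored condition is f(average) <= THRESHOLD, and that every node is given the same safe zone, a ball around the origin. All names (`f`, `THRESHOLD`, `in_safe_zone`, `violating_nodes`) are hypothetical. Because the ball is convex and lies inside the region where the condition holds, the average of vectors that all lie in the ball also lies in the ball, so no communication is needed while every local check passes.

```python
import math

THRESHOLD = 4.0                 # assumed global condition: f(average) <= THRESHOLD
CENTER = [0.0, 0.0]             # assumed safe-zone centre
RADIUS = math.sqrt(THRESHOLD)   # ball {v : ||v|| <= 2} lies inside {v : f(v) <= 4}


def f(v):
    """Assumed non-linear global function: squared Euclidean norm."""
    return sum(x * x for x in v)


def average(vectors):
    """Component-wise average of the nodes' local vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def in_safe_zone(v):
    """Local check that a node can run without any communication."""
    return math.dist(v, CENTER) <= RADIUS


def violating_nodes(local_vectors):
    """Indices of nodes whose local vector has left the safe zone."""
    return [i for i, v in enumerate(local_vectors) if not in_safe_zone(v)]


# Every node is inside the (convex) safe zone, so the global condition
# is guaranteed to hold without exchanging any data.
quiet = [[1.5, 0.0], [-1.0, 1.0], [0.0, -1.9]]
assert violating_nodes(quiet) == []
assert f(average(quiet)) <= THRESHOLD

# Local function values alone are uninformative for a non-linear f:
# both nodes have f(local) = 9 > THRESHOLD, yet f(average) = 0.
# Both nodes leave their safe zones, which only signals that
# communication is needed, not that the global condition is violated.
opposed = [[3.0, 0.0], [-3.0, 0.0]]
assert violating_nodes(opposed) == [0, 1]
assert f(average(opposed)) == 0.0
```

The second case shows that the local test is sufficient but not necessary: a safe-zone violation only means the nodes must communicate to re-check the global condition.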