Many real-world data sets are encoded into weighted graph structures (e.g. migration flows, airline connections, energy networks). There is an increasing interest in designing efficient algorithm to analyse such data. Big data representing non-regular modular structures pose serious computational and conceptual challenges even in simple visualisation procedures, node classifications or clustering methods. For such problems randomised algorithms can be crucial and can outperform deterministic algorithms.
The project aims at investigating random forests and related random-walk based sampling algorithms to probe the architecture of an arbitrary weighted graph. From the fundamental side, the focus is on the study of scaling limits of random spanning forests on weighted graphs representing hierarchical structures, which has connections to the theory of uniform spanning trees and determinantal processes. From the applied side, the focus is on designing a renormalisation algorithmic method to identify, in a multi-scale fashion and with the help of random partitionings induced by spanning forests, densely connected subgroups of nodes in data sets that are encoded into a network. The renormalisation scheme to be developed will be tested on data sets from different areas.
The project combines ideas from probability theory, combinatorics and algorithmics.
|Luca Avana (UL) and Alex Gaudilliere
|Universiteit Leiden (UL