Title: Algorithms for big data

Lecturers: Paolo Ferragina and Andrea Marino

Period: First 5 lectures: 18, 19, 20, 25, 26 November, 9-11, Sala Seminari EST.
Last 4 lectures: 18 March, 15-17, Sala Gerace; 20 March, 11-13, Sala Seminari EST;
25 March, 15-17, Sala Seminari OVEST; 27 March, 11-13, Sala Seminari EST.

Schedule:

Lecture 1. Data streams and sketching: distinct counting, frequent items, probabilistic estimators, similarity and min-hash sketches.

Lecture 2. Sampling: sampling in networks, Markov chain sampling. Applications to estimating properties of large networks.

Lecture 3. Mining in large graphs: social network analysis, clustering, community detection, spectral techniques.

Lecture 4. Dimensionality reduction: projections, locality-sensitive hashing, Johnson-Lindenstrauss.

Lecture 5. Large-scale distributed programming frameworks: MapReduce, Pregel, and examples of computation on large graphs.

Lecture 6. Diameter computation in real-world huge graphs.

Lecture 7. Distance distribution approximation and introduction to probabilistic counting.

Lecture 8. Probabilistic counting with applications.

Lecture 9. Centrality measures in complex networks.