PROGRAM SCHEDULE
(subject to change)
Tutorials
Exploring Complex Networks: Theory, Metrics, and Structure Detection in R & Python
by Andrea Montano Ramirez
In this hands-on tutorial, we will dive into the fascinating world of complex networks—from fundamental concepts to advanced structural patterns—through a practical and comparative lens using R and Python. We will begin with a concise introduction to network science, exploring real-world examples from social systems, biology, and infrastructure. Participants will learn how to construct, visualize, and manipulate networks using popular packages like igraph (in R and Python) and networkx (in Python). The tutorial will cover key network metrics—degree distribution, centrality measures (betweenness, closeness, eigenvector), clustering coefficients, and assortativity—as tools to quantify structure and identify important nodes. Moving beyond basic measures, we will introduce k-core decomposition to uncover hierarchical node organization and core-periphery structure detection to differentiate densely connected cores from sparse peripheries. Through hands-on coding exercises and real datasets, we will illustrate how these methods provide insight into the robustness, vulnerability, and functional roles of different parts of a network.
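For a taste of the exercises, the sketch below computes several of these metrics with networkx on a small built-in graph (a minimal sketch; the tutorial's own datasets and code will differ):

```python
import networkx as nx

# Zachary's karate club network ships with networkx as a toy example
G = nx.karate_club_graph()

# Centrality measures and local structure covered in the tutorial
betweenness = nx.betweenness_centrality(G)
closeness = nx.closeness_centrality(G)
eigenvector = nx.eigenvector_centrality(G)
clustering = nx.clustering(G)
assortativity = nx.degree_assortativity_coefficient(G)

# k-core decomposition: each node's core number reveals the
# hierarchical core-periphery organisation of the graph
core_numbers = nx.core_number(G)
print("Innermost core:", max(core_numbers.values()))
print("Degree assortativity:", round(assortativity, 3))
```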
Neural topic modeling with BERTopic
by Arnaldo Santoro
This tutorial provides a hands-on introduction to BERTopic, a Python library for state-of-the-art topic modeling that combines transformer embeddings with classic keyword scoring to generate interpretable topics from textual data. The session covers core functionality, customization, and visualization tools. In addition to practical, code-driven examples, the tutorial will offer guidance on when BERTopic is most appropriate—discussing considerations such as the optimal number of documents, document length, and suitable types of textual data. By the end of the tutorial, participants will understand how to effectively apply the standard BERTopic algorithm in real-world scenarios, how to customize it, and how to integrate it into their NLP workflows for meaningful topic discovery and analysis.
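To illustrate the standard workflow, here is a minimal sketch of fitting BERTopic to a document collection (the corpus here is a stand-in; the tutorial covers customization in depth):

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

# Any list of strings works; 20 newsgroups serves as a stand-in corpus
docs = fetch_20newsgroups(subset="all",
                          remove=("headers", "footers", "quotes")).data

# Default pipeline: transformer embeddings, UMAP, HDBSCAN, c-TF-IDF keywords
topic_model = BERTopic(language="english")
topics, probs = topic_model.fit_transform(docs)

# Inspect the discovered topics and their top keywords
print(topic_model.get_topic_info().head(10))
print(topic_model.get_topic(0))
```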
Analysing Large Open Human Mobility Data in Spain
by Egor Kotov
In this tutorial, participants will learn how to access and analyse large-scale open human mobility data for Spain derived from mobile phone records using the spanishoddata R package. They will gain hands-on experience accessing and analysing months or even years of mobility data locally on their laptops using a serverless DuckDB database. They will learn how to use the dplyr and duckdb R packages to efficiently manage out-of-memory computation, a technique that can be applied to a wide range of datasets and allows working with larger-than-memory datasets even on a consumer-grade laptop.
The tutorial opens with a 15-minute presentation structured as follows: a 4-minute introduction to the mobility dataset, a 4-minute discussion of out-of-memory computation and an explanation of how DuckDB works, and a 7-minute demonstration showcasing data summarisation and visualisation using the flowmapper and flowmapblue R packages. Afterwards, participants will have 15 minutes to work on a hands-on exercise, where they will be guided through the process of loading, aggregating and visualising the data. During the tutorial, participants will either access relatively small data online or get larger versions of the data (14+ GB and 60+ GB versions) from the USB storage provided by the instructor in class.
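The tutorial itself is taught in R, but the out-of-memory pattern it demonstrates is general. The Python sketch below shows the same idea through DuckDB's Python client: the query is pushed down to files on disk, and only the small aggregated result is loaded into memory (the file layout and column names are hypothetical):

```python
import duckdb

con = duckdb.connect()  # in-memory catalog; the data itself stays on disk

# Hypothetical layout: one parquet file per day of origin-destination flows
top_flows = con.sql("""
    SELECT id_origin, id_destination, SUM(n_trips) AS trips
    FROM read_parquet('od_data/*.parquet')
    GROUP BY id_origin, id_destination
    ORDER BY trips DESC
    LIMIT 20
""").df()  # only the aggregated result is materialised in RAM

print(top_flows)
```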
Making your research accessible and reproducible by tracking software dependencies, using pipeline build systems and containers
by Egor Kotov
In this tutorial, participants will learn how to design and maintain their research projects to improve code structure, enhance reproducibility, and increase robustness. They will also learn how to share projects easily and preserve the computational environment in containers, ensuring the long-term reproducibility of their analyses. Practical examples will be provided using R packages such as renv for package dependency management and targets for analysis pipeline management, as well as Docker and Apptainer for containerisation.
The session is divided into two parts. The first is a 15-minute presentation that covers reproducibility challenges, the importance of tracking dependencies, and managing analysis pipelines using build systems such as make, targets in R, and Snakemake, alongside containerisation with Docker and Apptainer. The second part is a 15-minute demonstration showcasing real-world examples of making legacy code reproducible with these concepts and tools. This demonstration will highlight how to future-proof current code and illustrate how these techniques can be applied when working locally, on high-performance computing (HPC) clusters, and when sharing projects online via services like mybinder.org, which allow users to run the analysis in a web browser without any installation.
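For a flavour of what a build-system-managed pipeline looks like, here is a minimal Snakefile sketch (Snakemake rules use a Python-based syntax; the file and script names are purely illustrative):

```snakemake
# Snakefile: each rule declares its inputs and outputs, so Snakemake
# re-runs only the steps whose dependencies have changed
rule all:
    input:
        "results/figure.png"

rule clean_data:
    input:
        "data/raw.csv"
    output:
        "data/clean.csv"
    shell:
        "python scripts/clean.py {input} {output}"

rule make_figure:
    input:
        "data/clean.csv"
    output:
        "results/figure.png"
    shell:
        "python scripts/plot.py {input} {output}"
```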
Due to time constraints, there will be no hands-on session during this 30-minute tutorial. However, participants will be provided with detailed, step-by-step instructions for each of the processes covered. A recent version of this tutorial with instructions and code examples is available at https://www.ekotov.pro/2025-mpidr-open-science-reproducible-workflows/
Investigating higher-order interactions using a recently developed Python toolbox: HOI
by Matteo Neri
Over the past two decades, network science has provided novel perspectives across a wide range of research questions and complex systems. Traditionally, however, network-based approaches have focused primarily on pairwise interactions, often neglecting interactions that simultaneously involve three or more elements—referred to as higher-order interactions (HOIs). A growing body of research has recently underscored the critical role of HOIs in shaping the behavior of complex systems. Taking HOIs into account can yield more accurate and comprehensive explanations of diverse phenomena, from information integration in the brain to the spread of epidemics in society.
But how can we effectively study HOIs in complex systems? One promising approach is grounded in information theory, which allows for the quantification of HOIs through measures of synergy, redundancy, and other informational properties. To support this line of research and to make the investigation of HOIs feasible in practice, we have developed HOI (https://github.com/brainets/hoi), a flexible and efficient Python toolbox for computing these metrics on any type of multivariate data. Built with cutting-edge tools like JAX (https://docs.jax.dev/en/latest/quickstart.html), the toolbox combines computational efficiency with ease of use. Users can begin analyzing their own data with just a few lines of code, while those with more experience in the field can leverage the modular design to develop new metrics.
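A minimal sketch of the intended workflow is shown below, based on the toolbox's documented O-information metric; the exact class and argument names are assumptions and should be checked against the current documentation:

```python
import numpy as np
from hoi.metrics import Oinfo  # assumed import path, per the project's docs

# Toy multivariate data: 200 samples of 6 variables
x = np.random.randn(200, 6)

# O-information over all multiplets of 3 to 5 variables;
# positive values indicate redundancy, negative values synergy
model = Oinfo(x)
hoi = model.fit(minsize=3, maxsize=5)
print(hoi.shape)  # one estimate per multiplet
```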
An introduction to graph-tool and statistical inference using the minimum description length principle
by Erik Weis
Graph-tool is a powerful software package for performing various network analysis tasks, including network community detection and network inference. While this family of algorithms can be used out of the box with minimal coding effort, scientific application of these tools requires a more thorough understanding of how they work. The first goal of this tutorial is to provide a basic intuition for the conceptual core of graph-tool's methods—i.e., the minimum description length (MDL) principle. Then, we will survey different algorithmic variants, including community detection for weighted, multilayer, and bipartite graphs as well as sequence data, and network reconstruction from noisy data or time series. Finally, we will cover the various approaches to statistical inference, including a brief overview of inference algorithms and their hyperparameters. Attendees should leave with a basic intuition for how to justify their use of MDL-based methods in scientific research, as well as a practical understanding of how to use graph-tool for such purposes.
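As a preview, fitting a stochastic block model by minimising the description length takes only a few lines in graph-tool; the sketch below uses a small dataset bundled with the library:

```python
import graph_tool.all as gt

# A small co-purchasing network of political books, bundled with graph-tool
g = gt.collection.data["polbooks"]

# Fit a stochastic block model by minimising the description length (MDL)
state = gt.minimize_blockmodel_dl(g)

print("Groups found:", state.get_B())
print("Description length (nats):", state.entropy())
state.draw(output="polbooks-blocks.png")  # nodes coloured by inferred group
```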
Manipulating temporal networks using NetworkX-Temporal
by Nelson Aloysio Reis De Almeida Passos
NetworkX-Temporal is a Python library designed for building and manipulating temporal graphs. It implements functions to slice, transform, and convert dynamic graph data to different formats and libraries, allowing researchers and practitioners to employ it for various analysis and exploration tasks. This tutorial aims to provide a practical introduction with hands-on examples, including calculating node centrality and graph centralization, drawing and visualizing graphs, and detecting and tracking communities over time. A basic understanding of coding in Python is assumed, and prior experience with network analysis in Python is beneficial, but not mandatory.
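The sketch below uses plain networkx to illustrate the underlying idea that the library automates, slicing a time-stamped edge list into snapshots and tracking a metric over time (this is not NetworkX-Temporal's own API, and the data is made up):

```python
import networkx as nx

# Time-stamped edge list: (source, target, time)
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 2),
         ("c", "d", 2), ("a", "d", 3), ("b", "d", 3)]

# Slice into one snapshot graph per time step
snapshots = {}
for u, v, t in edges:
    snapshots.setdefault(t, nx.Graph()).add_edge(u, v)

# Track node degree centrality across snapshots
for t, G in sorted(snapshots.items()):
    centrality = nx.degree_centrality(G)
    print(f"t={t}:", {n: round(c, 2) for n, c in centrality.items()})
```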