# Scientific

## Learning Tasks in the Wasserstein Space

Detecting differences between distributions and building classifiers for them, given only finite samples, are important tasks in a number of scientific fields. Optimal transport (OT) has emerged as the most natural concept for measuring the distance between distributions and has gained significant importance in machine learning in recent years. OT has some drawbacks: it can be slow to compute, and it often fails to exploit reduced complexity when the family of distributions is generated by simple group actions.

If we make no assumptions on the family of distributions, these drawbacks are difficult to overcome. However, in the case that the measures are generated by push-forwards by elementary transformations, forming a low-dimensional submanifold of the Wasserstein manifold, we can deal with both of these issues on a theoretical and on a computational level. In this talk, we’ll show how to embed the space of distributions into a Hilbert space via linearized optimal transport (LOT), and how linear techniques can be used to classify different families of distributions generated by elementary transformations and perturbations. The proposed framework significantly reduces both the computational effort and the required training data in supervised learning settings. We demonstrate the algorithms in pattern recognition tasks in imaging and provide some medical applications.

This is joint work with Alex Cloninger, Keaton Hamm, Harish Kannan, Varun Khurana, and Jinjie Zhang.
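
To make the embedding idea concrete, here is a minimal one-dimensional sketch. It is not the speakers' implementation. It assumes equal-size samples and a reference measure that is uniform on n atoms, so that the optimal map sends the i-th reference atom to the i-th order statistic. The helper names `lot_embedding_1d` and `euclidean` are hypothetical. In one dimension this embedding is an isometry for the 2-Wasserstein distance, so a translation of the data becomes a translation of the embedded vector.

```python
import math
import random


def lot_embedding_1d(samples):
    """Embed an empirical measure on the real line as its scaled sorted sample vector.

    Assumption of this sketch: the reference measure is uniform on n atoms, so
    the monotone (optimal) map sends the i-th reference atom to the i-th order
    statistic. The 1/sqrt(n) factor turns the Euclidean norm into the
    L2(reference) norm.
    """
    n = len(samples)
    return [x / math.sqrt(n) for x in sorted(samples)]


def euclidean(u, v):
    """Plain Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [x + 3.0 for x in base]  # push-forward of the sample by x -> x + 3

emb_base = lot_embedding_1d(base)
emb_shifted = lot_embedding_1d(shifted)

# Distance in the embedding space; for a pure shift by 3 this is 3 up to rounding.
dist = euclidean(emb_base, emb_shifted)
print(dist)
```

Once every distribution is a vector, standard linear classifiers can be applied to the embedded data; in higher dimensions the transport maps have to be computed by an OT solver instead of by sorting.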

## Effect of Dependence on the Convergence of Empirical Wasserstein Distance

The Wasserstein distance is a powerful tool in modern machine learning to metrize the space of probability distributions in a way that takes into account the geometry of the domain. Therefore, a lot of attention has been devoted in the literature to understanding rates of convergence for Wasserstein distances based on i.i.d. data. However, in many machine learning applications, especially in reinforcement learning, object tracking, performative prediction, and other online learning problems, observations are received sequentially, which introduces inherent temporal dependence. Motivated by this observation, we attempt to understand the problem of estimating Wasserstein distances using the natural plug-in estimator based on stationary beta-mixing sequences, a widely used assumption in the study of dependent processes. Our rate-of-convergence results apply under both short- and long-range dependence. As expected, under short-range dependence, the rates match those observed in the i.i.d. case. Interestingly, however, even under long-range dependence, we can show that the rates can match those in the i.i.d. case provided the (intrinsic) dimension is large enough. Our analysis establishes a non-trivial trade-off between the degree of dependence and the complexity of certain function classes on the domain. The key technique in our proofs is a blend of the big-block-small-block method coupled with Berbee’s lemma and chaining arguments for suprema of empirical processes.
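
As a rough illustration of a plug-in estimate from dependent data, the sketch below computes the 1-Wasserstein distance between two empirical measures on the real line. It is an assumption-laden toy, not the setting analysed in the talk: it works only in one dimension with equal sample sizes, uses a stationary Gaussian AR(1) path as a stand-in for a short-range dependent sequence, and compares it with an independent i.i.d. sample from the same stationary law. The function names are hypothetical.

```python
import math
import random


def wasserstein1_empirical(xs, ys):
    """W_1 between two empirical measures on the real line with equally many atoms.

    In one dimension the optimal coupling matches order statistics, so the
    distance is the mean absolute difference of the sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs_sorted, ys_sorted)) / len(xs)


def ar1_path(n, rho, rng):
    """Stationary Gaussian AR(1) path: x_{t+1} = rho * x_t + eps_t, eps_t ~ N(0, 1)."""
    # Start from the stationary law N(0, 1 / (1 - rho^2)).
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - rho ** 2))
    path = []
    for _ in range(n):
        path.append(x)
        x = rho * x + rng.gauss(0.0, 1.0)
    return path


rng = random.Random(1)
n, rho = 2000, 0.5
dependent = ar1_path(n, rho, rng)
stationary_sd = 1.0 / math.sqrt(1.0 - rho ** 2)
independent = [rng.gauss(0.0, stationary_sd) for _ in range(n)]

# Both samples share the same marginal law, so this estimate shrinks as n grows.
print(wasserstein1_empirical(dependent, independent))
```

Repeating the experiment for several values of `n` and `rho` gives an informal feel for how dependence affects the estimate.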

## Influence of the endothelial surface layer on the motion of red blood cells

The endothelial lining of blood vessels presents a large surface area for exchanging materials between blood and tissues. The endothelial surface layer (ESL) plays a critical role in regulating vascular permeability, hindering leukocyte adhesion, and inhibiting coagulation during inflammation. Changes in the ESL structure are believed to cause vascular hyperpermeability and induce thrombus formation during sepsis. In addition, ESL topography is relevant for the interactions between red blood cells (RBCs) and the vessel wall, including the wall-induced migration of RBCs and the formation of a cell-free layer. To investigate the influence of the ESL on the motion of RBCs, we construct two models to represent the ESL using the immersed boundary method in two dimensions. In particular, we use simulations to study how lift force and drag force change over time when an RBC is placed close to the ESL as the thickness, spatial variation, and permeability of the ESL vary. We find that spatial variation has a significant effect on the wall-induced migration of the RBC when the ESL is highly permeable and that the wall-induced migration can be significantly inhibited by the presence of a thick ESL.

## Siegel-Veech transform

In this talk, I will introduce the Siegel-Veech transform and explain how it can be used to count cylinders of bounded length on a translation surface. This counting result relies, among other tools, on the ergodicity of the SL(2, R) action on the moduli space of translation surfaces. The talk will not assume prior knowledge of translation surfaces, and most of the techniques used come from homogeneous dynamics.

## Pointwise ergodic theorem along a subsequence of integers

Since Birkhoff’s Pointwise Ergodic Theorem was proved in 1931, there have been many attempts to generalize the theorem to averages taken along a subsequence of the integers instead of the entire sequence $(n)$. In this talk, we will present the following result of Roger Jones and Máté Wierdl:

If a sequence $(a_n)$ satisfies $a_{n+1}/a_n \geq 1 + 1/(\log n)^{\frac{1}{2}-\varepsilon}$ for some $\varepsilon > 0$, then in any aperiodic dynamical system $(X, \Sigma, \mu, T)$ we can always find a function $f \in L^2$ such that the Cesàro averages along the sequence $(a_n)$, defined by

$$A_{n\in[N]} f(T^{a_n}x) := \frac{1}{N}\sum_{n\in[N]} f(T^{a_n}x),$$

fail to converge on a set of positive measure.

## Agent-based models: from bacterial aggregation to wealth hot-spots

Agent-based models are widely used in numerous applications. They have the advantage of being easy to formulate and to implement on a computer. On the other hand, gaining mathematical insight (motivated by, but going beyond, computer simulations) often requires looking at the continuum limit where the number of agents becomes large. In this talk I give several examples of agent-based modelling, including bacterial aggregation, a spatio-temporal SIR model, and wealth hotspots in society, going from their derivation, through their continuum limit, to the analysis of the resulting continuum equations.

## On vertex-transitive graphs with a unique hamiltonian circle

We will discuss graphs that have a unique hamiltonian cycle and are vertex-transitive, which means there is an automorphism that takes any vertex to any other vertex. Cycles are the only examples with finitely many vertices, but the situation is more interesting for infinite graphs. (Infinite graphs do not have "hamiltonian cycles," but there are natural analogues.) The case where the graph has only finitely many ends is not difficult, but we do not know whether there are examples with infinitely many ends. This is joint work in progress with Bobby Miraftab.

## Random plane geometry -- a gentle introduction

Consider Z^2, and assign a random length of 1 or 2 to each edge based on independent fair coin tosses. The resulting random geometry, first-passage percolation, is conjectured to have a scaling limit. Most random plane geometric models (including hidden geometries) should have the same scaling limit. I will explain the basics of the limiting geometry, the "directed landscape", the central object in the class of models named after Kardar, Parisi and Zhang.
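
The model in the first sentence is easy to simulate. The sketch below is illustrative only: it restricts paths to the finite box [0, n]^2 (a simplification not made in the abstract) and uses Dijkstra's algorithm to compute the passage time from (0, 0) to (n, n). The function name `passage_time` and the `seed` parameter are hypothetical.

```python
import heapq
import random


def passage_time(n, seed=0):
    """First-passage time from (0, 0) to (n, n) inside the box [0, n]^2 of Z^2.

    Each nearest-neighbour edge gets length 1 or 2 by an independent fair coin.
    Restricting paths to the box is a simplification made by this sketch.
    """
    rng = random.Random(seed)
    weights = {}

    def edge_length(u, v):
        # Toss the coin the first time an edge is seen, then reuse the value.
        key = (u, v) if u <= v else (v, u)
        if key not in weights:
            weights[key] = rng.choice((1, 2))
        return weights[key]

    # Dijkstra's algorithm with a binary heap.
    dist = {(0, 0): 0}
    heap = [(0, (0, 0))]
    while heap:
        d, u = heapq.heappop(heap)
        if u == (n, n):
            return d
        if d > dist[u]:
            continue  # stale heap entry
        x, y = u
        for v in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= v[0] <= n and 0 <= v[1] <= n:
                nd = d + edge_length(u, v)
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
    raise RuntimeError("target not reached")


# Any path from (0, 0) to (n, n) has at least 2n edges, and a monotone path
# has exactly 2n edges of length at most 2, so the result lies in [2n, 4n].
print(passage_time(20))
```

Running this for many seeds and increasing `n` gives a first look at the fluctuations of the passage time, which is where the conjectured scaling limit enters.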

## Primes, postdocs and pretentiousness

Reflections on the research developments that have contributed to this award, mostly to do with the distribution of primes and multiplicative functions, discussing my research team's contributions, and the possible future for several of these questions.

## Height gaps for coefficients of D-finite power series

A power series $f(x_1,\ldots,x_m)\in \mathbb{C}[[x_1,\ldots,x_m]]$ is said to be D-finite if all the partial derivatives of $f$ span a finite dimensional vector space over the field $\mathbb{C}(x_1,\ldots,x_m)$. For the univariate series $f(x)=\sum a_nx^n$, this is equivalent to the condition that the sequence $(a_n)$ is P-recursive, meaning that it satisfies a non-trivial linear recurrence relation of the form

$$P_d(n)a_{n+d}+\cdots+P_0(n)a_n=0,$$

where the $P_i$ are polynomials. In this talk, we consider D-finite power series with algebraic coefficients and discuss the growth of the Weil height of these coefficients. This is based on joint work with Jason Bell and Umberto Zannier from 2019 and on more recent work from June 2022.
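
As a small worked instance of a P-recursive sequence (chosen here for illustration, not taken from the talk), the Catalan numbers satisfy the recurrence above with $d=1$, $P_1(n)=n+2$ and $P_0(n)=-(4n+2)$. The sketch below generates them from the recurrence and checks the result against the closed form $\binom{2n}{n}/(n+1)$; the function name is hypothetical.

```python
from math import comb


def catalan_via_recurrence(count):
    """First `count` Catalan numbers from (n + 2) a_{n+1} - (4n + 2) a_n = 0, a_0 = 1."""
    if count <= 0:
        return []
    terms = [1]
    for n in range(count - 1):
        # Solve the recurrence for a_{n+1}; the division is exact.
        terms.append((4 * n + 2) * terms[n] // (n + 2))
    return terms


terms = catalan_via_recurrence(10)
closed_form = [comb(2 * n, n) // (n + 1) for n in range(10)]
print(terms == closed_form)
```

For a nonzero integer the logarithmic Weil height is the logarithm of its absolute value, so in this example the height of the coefficients grows roughly linearly in $n$.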