## Colloquium

To receive schedule updates by email, contact the Colloquium Chair, Prof. Marius Beceanu.

The colloquium usually meets on Friday at 3:00 pm in room ES-143.

The colloquium is partially supported by Simons Foundation grant #584738 (Cristian Lenart).

Yiming Ying (University at Albany)

Online AUC Maximization in Machine Learning

Friday, September 6, 2019

3:00 p.m. in ES-143

Abstract: Stochastic optimization algorithms such as stochastic (online) gradient descent (SGD) update the model sequentially with cheap per-iteration costs, making them amenable to big streaming-data analysis. However, most existing studies focus on classification accuracy and cannot be applied directly to the important problem of maximizing the area under the ROC curve (AUC) in imbalanced classification and bipartite ranking. In this talk, I will present our recent work on developing novel SGD-type algorithms for AUC maximization. Our new algorithms process streaming data sequentially and were obtained through innovative interactions between machine learning and applied mathematics. In contrast to existing work, which requires high storage and per-iteration costs, our algorithms need only the space and per-iteration cost of one datum while achieving optimal convergence rates.
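For readers unfamiliar with the setting: AUC is a pairwise criterion, comparing scores of positive examples against negative ones, which is why naive stochastic methods must store past data. The following is a minimal toy sketch (a pairwise-hinge baseline, not the algorithm of the talk; all names are illustrative) that keeps only the last-seen example of each class, so storage stays at one datum per class:

```python
import numpy as np

def online_auc_sgd(stream, dim, lr=0.01):
    """Toy pairwise-hinge SGD for AUC: each arriving example is
    paired with the last-seen example of the opposite class, so
    storage and per-iteration cost are both O(dim)."""
    w = np.zeros(dim)
    last = {+1: None, -1: None}          # last-seen example of each class
    for x, y in stream:
        opp = last[-y]
        if opp is not None:
            pos, neg = (x, opp) if y == +1 else (opp, x)
            if w @ (pos - neg) < 1.0:    # hinge surrogate of the AUC loss
                w += lr * (pos - neg)
        last[y] = x
    return w

# Usage: a shuffled stream drawn from two Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+1.0, 1.0, (200, 5)), rng.normal(-1.0, 1.0, (200, 5))])
y = np.array([+1] * 200 + [-1] * 200)
order = rng.permutation(400)
w = online_auc_sgd(zip(X[order], y[order]), dim=5)
scores = X @ w
# empirical AUC: fraction of (positive, negative) pairs ranked correctly
auc = np.mean([scores[i] > scores[j] for i in range(200) for j in range(200, 400)])
```

The point of the contrast in the abstract is that even this toy version avoids storing the full history; the algorithms in the talk achieve the same one-datum footprint with provably optimal convergence rates.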

**Distinguished Lecture:** Vyjayanthi Chari (UC Riverside)

Cluster Algebras and Monoidal Categorification

Friday, October 25, 2019

3:00 p.m. in LC-20

Abstract: Cluster algebras were introduced by Fomin and Zelevinsky at the turn of the century to understand questions of total positivity in Lie theory. The subject is now a very active area of research, with many interesting connections and unexpected applications in geometry, combinatorics, and representation theory. In 2009, Hernandez and Leclerc introduced the notion of a monoidal categorification of a cluster algebra: a realization of the cluster algebra as the Grothendieck ring of a certain tensor category. In this talk we shall first give a gentle introduction to these ideas. We will then discuss some recent work with Matheus Brito which makes further connections with Macdonald polynomials and Demazure modules.
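As a concrete warm-up example (standard, not drawn from the talk): in the rank-2 cluster algebra of type $A_2$, the exchange relation $x_{n+1} = (1+x_n)/x_{n-1}$ produces exactly five distinct cluster variables before the sequence repeats, and every variable is a Laurent polynomial in $x_1, x_2$ (the Laurent phenomenon). A quick symbolic check, assuming sympy is available:

```python
from sympy import symbols, simplify

x1, x2 = symbols('x1 x2')

# Type A2 exchange relation: x_{n+1} = (1 + x_n) / x_{n-1}
seq = [x1, x2]
for _ in range(5):
    seq.append(simplify((1 + seq[-1]) / seq[-2]))

# seq[2], seq[3], seq[4] are the three non-initial cluster variables;
# the recursion is 5-periodic, so seq[5] == x1 and seq[6] == x2.
# Each entry simplifies to a Laurent polynomial in x1 and x2,
# despite the division in the exchange relation.
```

The surprising cancellation that makes every `seq[n]` a Laurent polynomial is a simple instance of the structure that monoidal categorification explains categorically.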

Alexander Powell (Vanderbilt University)

Low-bit representations of overcomplete signal expansions and neural networks

Friday, November 1, 2019

3:00 p.m. in ES-143

Abstract: We consider mathematical aspects of how to digitally represent information. Redundancy, or oversampling, is an important ingredient in many types of signal representations. For example, in the classical Shannon sampling theorem, redundancy provides design flexibility and robustness against noisy measurements. We shall discuss analog-to-digital conversion for redundant signal representations. We prove error bounds that quantify how well different quantization methods, such as consistent reconstruction and Sigma-Delta quantization, utilize redundancy. Lastly, we discuss the problem of training neural networks with low-bit weights; we consider an approach based on stochastic Markov gradient descent (SMGD) and prove that the method performs well both theoretically and numerically.
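To make the two ingredients concrete, here is a minimal sketch (an illustration, not a construction from the talk) of first-order Sigma-Delta quantization applied to a redundant frame expansion in $\mathbb{R}^2$: each coefficient is replaced by a single bit, yet linear reconstruction recovers the signal with error that shrinks as the oversampling $N$ grows.

```python
import numpy as np

def sigma_delta_1bit(coeffs):
    """First-order Sigma-Delta: quantize each coefficient to +/-1,
    feeding the running error u back into the next decision."""
    q = np.empty_like(coeffs)
    u = 0.0
    for i, c in enumerate(coeffs):
        q[i] = 1.0 if c + u >= 0 else -1.0
        u = u + c - q[i]            # state recursion: u_i = u_{i-1} + c_i - q_i
    return q

# Oversampled expansion of x in R^2: N unit vectors at equally
# spaced angles form a tight frame with frame constant N/2.
N = 256
angles = np.linspace(0.0, np.pi, N, endpoint=False)
F = np.column_stack([np.cos(angles), np.sin(angles)])   # N x 2 analysis frame
x = np.array([0.3, -0.2])
c = F @ x                        # frame coefficients, |c_i| <= ||x|| < 1
q = sigma_delta_1bit(c)          # one bit per coefficient
x_hat = (2.0 / N) * F.T @ q      # linear reconstruction (canonical dual frame)
err = np.linalg.norm(x - x_hat)  # decays roughly like 1/N for this scheme
```

The noise-shaping recursion pushes the quantization error into differences of consecutive coefficients, which the smooth frame then averages away; higher-order schemes and consistent reconstruction exploit the redundancy even more efficiently.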

Rongwei Yang (University at Albany)

Some Aspects of Projective Spectrum

Friday, November 15, 2019

3:00 p.m. in ES-143

Abstract: Finitely generated structures are important subjects of study in various mathematical disciplines. Examples include finitely generated groups, finitely generated Lie algebras and $C^*$-algebras, tuples of several linear operators, etc. It is thus a fundamental question whether there exists a universal mechanism for the study of these vastly different entities. In 2009, the notion of the projective spectrum of several elements $A_0, A_1, \ldots, A_n$ in a unital Banach algebra $B$ was defined through the multiparameter pencil $A(z) = z_0A_0 + z_1A_1 + \cdots + z_nA_n$, where the coefficients $z_j$ are complex numbers. This conspicuously simple definition turns out to have surprisingly rich content. In the first part of the talk we will review some results in finite dimensions; the central topic here is the characteristic polynomial of several matrices and its use in the study of finite groups and Lie algebras. The second part reviews some results in infinite dimensions relating to complex dynamics, complex geometry, and operator theory. The talk will be slow-paced and student-friendly.
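In the finite-dimensional case the projective spectrum is simply the zero set of the determinant of the pencil, so it can be computed symbolically. A small illustrative example (not taken from the talk), assuming sympy is available:

```python
from sympy import Matrix, symbols, det, factor, expand

z0, z1 = symbols('z0 z1')

# Pencil A(z) = z0*A0 + z1*A1 for a pair of 2x2 matrices.
A0 = Matrix([[1, 0], [0, 1]])    # the identity
A1 = Matrix([[0, 1], [1, 0]])    # a reflection
p = factor(det(z0 * A0 + z1 * A1))

# p == (z0 - z1)*(z0 + z1): the pencil A(z) fails to be invertible
# exactly on the two lines z0 = z1 and z0 = -z1, so the projective
# spectrum of the pair (A0, A1) consists of two points in
# projective space.
```

The factorization of this "characteristic polynomial of several matrices" carries structural information about the tuple, which is the theme of the first part of the talk.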

Suvrit Sra (MIT)

ReLU nets are powerful memorizers: a tight analysis of finite sample expressive power

Friday, December 6, 2019

3:00 p.m. in ES-143

Abstract: I will talk about finite sample expressivity, also known as the memorization power, of ReLU networks. Recent results showed (unsurprisingly) that arbitrary input data can be perfectly memorized by a shallow ReLU network whose single hidden layer has $N$ hidden nodes. I will describe a more careful construction that trades off width for depth, showing that a ReLU network with two hidden layers, each with $2\sqrt{N}$ hidden nodes, can perfectly memorize arbitrary datasets. Moreover, we prove that a width of $\Theta(\sqrt{N})$ is both necessary and sufficient for perfect memorization. A notable corollary of this result is that mild overparametrization suffices for a neural network to achieve zero training loss! We also extend our results to deep networks and, combined with recent results on the VC-dimension of deep nets, show that our bounds on memorization power are nearly tight. Time permitting, I will mention expressive-power results for ResNets, as well as how SGD behaves when optimizing such networks.
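The "unsurprising" one-hidden-layer baseline is easy to see in one dimension: the piecewise-linear interpolant through $N$ points is itself a shallow ReLU network with about $N$ hidden units. A minimal sketch (illustrative only; the talk's contribution is the much narrower two-hidden-layer construction):

```python
import numpy as np

def memorize_relu_1layer(x, y):
    """Fit N scalar points exactly with a one-hidden-layer ReLU net:
    f(t) = bias + sum_j d_j * relu(t - knot_j), i.e. the
    piecewise-linear interpolant through the (sorted) points."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    slopes = np.diff(y) / np.diff(x)           # slope on each interval
    d = np.diff(slopes, prepend=0.0)           # slope change at each knot
    knots, bias = x[:-1], y[0]                 # one hidden unit per knot
    def f(t):
        return bias + np.maximum(np.asarray(t)[..., None] - knots, 0.0) @ d
    return f

# Usage: memorize 10 random points exactly.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 10)
y = rng.normal(size=10)
f = memorize_relu_1layer(x, y)
max_err = np.max(np.abs(f(x) - y))             # exact up to float round-off
```

This costs one hidden unit per data point; the construction in the talk shows that two hidden layers of width $2\sqrt{N}$ suffice, and that width $\Theta(\sqrt{N})$ cannot be improved.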