Geometric Statistics for High-Dimensional Data Analysis
ABSTRACT: We present a scheme for studying the geometry of high-dimensional data to discover patterns in it, using minimal parametric distributional assumptions. Our approach is to define multivariate quantiles and extremes, and to develop a method for center-outward partial ordering of observations. We formulate methods for quantifying relationships among observed variables, thus generalizing the notions of regression and principal components. We propose tests for linear relations between variables in many dimensions using the geometric properties of the data, thus paving the way for checking whether recent developments involving Gaussian assumptions or sparsity of relations are applicable. We devise geometric algorithms for detection of outliers in high dimensions, classification, and supervised learning. Examples of the use of the proposed methods will be provided. This is joint work with several students.
BIO: Dr. Ansu Chatterjee is a tenured Associate Professor at the School of Statistics and the Director of the Institute for Research in Statistics and its Applications (IRSA, http://irsa.stat.umn.edu/). His research interests include Data Science theory and methods, and Big Data applications to climate sciences, networks and graphs, neuroimaging, and social sciences. He has published in the Annals of Statistics, Annals of Applied Statistics, Annals of the Institute of Statistical Mathematics, Bioinformatics, and other top Statistics journals, and in peer-reviewed Computer Science conferences in machine learning and data mining.