Brains, Minds and Machines Seminar: Modal-Set Estimation using kNN graphs, and Applications to Clustering
Friday, October 12, 2018 at 4:00pm to 5:00pm
Brain and Cognitive Sciences Complex, Singleton Auditorium (MIT 46-3002) 43 Vassar Street, Cambridge MA 02139
Speaker: Prof. Samory Kpotufe, Princeton University
Abstract: Estimating the mode or modal-sets (i.e. extrema points or surfaces) of an unknown density from sample is a basic problem in data analysis. Such estimation is relevant to other problems such as clustering, outlier detection, or can simply serve to identify low-dimensional structures in high dimensional-data (e.g. point-cloud data from medical-imaging, astronomy, etc). Theoretical work on mode-estimation has largely concentrated on understanding its statistical difficulty, while less attention has been given to implementable procedures. Thus, theoretical estimators, which are often statistically optimal, are for the most part hard to implement. Furthermore for more general modal-sets (general extrema of any dimension and shape) much less is known, although various existing procedures (e.g. for manifold-denoising or density-ridge estimation) have similar practical aim. I’ll present two related contributions of independent interest: (1) practical estimators of modal-sets – based on particular subgraphs of a k-NN graph – which attain minimax-optimal rates under surprisingly general distributional conditions; (2) high-probability finite sample rates for k-NN density estimation which is at the heart of our analysis. Finally, I’ll discuss recent successful work towards the deployment of these modal-sets estimators for various clustering applications.