Talk Title: Matrix-based Learning Algorithms for Data Mining and
Bioinformatics
Speaker: Dr. Chris Ding from Lawrence Berkeley National Laboratory
Talk abstract:
Matrix-based data mining and statistical learning are going through a renaissance
with many new developments. We describe several major advances in the
area. We show that Principal Component Analysis (PCA) provides solutions to
K-means clustering, thus connecting dimension reduction to clustering, two
fundamental aspects of unsupervised learning.
We describe state-of-the-art Laplacian-matrix-based spectral clustering methods,
whose effectiveness results from a self-aggregation property due to the
nonlinear mapping. We describe their applications in bioinformatics and social
sciences. These advances pave the way to establishing a matrix-factorization-based
learning framework, a powerful new direction in data mining. They benefit
significantly from matrix knowledge accumulated over centuries and from the
successful development of scientific and engineering computing over the last
30 years. We also describe large-scale data mining on distributed computers
and over the Grid.
Short bio:
Chris Ding is a staff computer scientist at Lawrence Berkeley National
Laboratory. He received a Ph.D. from Columbia University and did research at
the California Institute of Technology and the Jet Propulsion Laboratory. His research
focuses on bioinformatics and machine learning / data mining. He develops
efficient graph algorithms using matrix computation.
More information about him can be found at http://crd.lbl.gov/~cding.