References
This page provides references to classic papers and books in various fields of machine learning and big data analysis. For beginners and new students, it offers foundations for your own research; for senior students, we hope the pioneering work of others in your field inspires your own.
Machine Learning
- For an overview:
- Hastie, Tibshirani, Friedman; Elements of Statistical Learning.
- Duda, Hart, Stork; Pattern Classification.
- Wasserman; All of Statistics: A Concise Course in Statistical Inference
- MacKay; Information Theory, Inference, and Learning Algorithms
- More theoretical material:
- Devroye, Györfi, Lugosi; A Probabilistic Theory of Pattern Recognition.
- Mohri, Rostamizadeh, Talwalkar; Foundations of Machine Learning (Adaptive Computation and Machine Learning series)
- Scott; Prof. Scott's lecture notes
- Boucheron, Lugosi, Massart; Concentration Inequalities: A Nonasymptotic Theory of Independence
- Penalized estimation:
- Liu, Roeder, Wasserman; Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models
General Surveys
- Seeger; Bayesian Modelling in Machine Learning: A Tutorial Review
- Domingos; A Few Useful Things to Know about Machine Learning (good for beginners)
- Kass; Statistical Inference: The Big Picture
Spectral Methods
- Spectral clustering:
- Ng, Jordan, Weiss; On Spectral Clustering: Analysis and an Algorithm.
- Shi, Malik; Normalized Cuts and Image Segmentation.
- Laplacian Eigenmaps: http://www.cse.ohio-state.edu/~mbelkin/papers/papers.html
- Belkin, Niyogi; Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
- von Luxburg, Belkin, Bousquet; Consistency of Spectral Clustering
- Diffusion maps:
- Coifman, Lafon; Diffusion maps
- Lafon, Keller, Coifman; Data Fusion and Multicue Data Matching by Diffusion Maps
- Nadler, Lafon, Coifman, Kevrekidis; Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators
- (Semi-)supervised learning:
- Costa, Hero; Classification constrained dimensionality reduction
- Raich, Hero; On dimensionality reduction for classification and its application
- Raich, Costa, Damelin, Hero; Classification constrained dimensionality reduction
- Zhou, Li; Semi-supervised learning by disagreement
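As a companion to the spectral clustering papers above, here is a minimal NumPy sketch in the spirit of the Ng-Jordan-Weiss algorithm: Gaussian affinity matrix, symmetrically normalized Laplacian, and k-means on the row-normalized bottom eigenvectors. The bandwidth `sigma` and the plain Lloyd's loop are illustrative choices, not ones prescribed by the papers.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, n_iter=50, seed=0):
    """Sketch of spectral clustering (Ng-Jordan-Weiss style):
    Gaussian affinities -> normalized Laplacian -> bottom-k
    eigenvector embedding -> k-means on the embedded rows."""
    n = X.shape[0]
    # Pairwise squared distances and Gaussian affinities (zero diagonal).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetrically normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    Dinv = 1.0 / np.sqrt(W.sum(1))
    L = np.eye(n) - Dinv[:, None] * W * Dinv[None, :]
    # Bottom-k eigenvectors give the spectral embedding; row-normalize.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # Plain Lloyd's k-means on the embedding.
    rng = np.random.default_rng(seed)
    centers = U[rng.choice(n, k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(0)
    return labels
```

On two well-separated blobs the affinity matrix is nearly block-diagonal, so the bottom two eigenvectors act as (smoothed) cluster indicators and the k-means step becomes trivial.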
Dimensionality Estimation
- Costa, Hero; Geodesic entropic graphs for dimension and entropy estimation in manifold learning
- Carter, Hero; Variance reduction with neighborhood smoothing for local intrinsic dimension estimation
- van der Maaten, Postma, van den Herik; Dimensionality Reduction: A Comparative Review
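The estimators cited above are graph-based and handle curved manifolds; as a point of comparison, here is the much simpler linear baseline of counting principal components needed to explain most of the variance. The 95% threshold is an arbitrary illustrative choice.

```python
import numpy as np

def pca_dimension_estimate(X, var_threshold=0.95):
    """Crude intrinsic-dimension estimate: the number of principal
    components needed to explain `var_threshold` of the total
    variance. A linear baseline only -- unlike the geodesic-graph
    estimators of Costa-Hero, it cannot detect curved structure."""
    Xc = X - X.mean(0)
    # Covariance eigenvalues, largest first.
    evals = np.linalg.eigvalsh(np.cov(Xc.T))[::-1]
    ratios = np.cumsum(evals) / evals.sum()
    return int(np.searchsorted(ratios, var_threshold) + 1)
```

For data lying on a 2-D plane embedded in 5-D (plus tiny noise), two components capture essentially all the variance, so the estimate is 2.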
Online learning and Boosting
- Schapire, Freund; Boosting: Foundations and Algorithms (Adaptive Computation and Machine Learning series)
- Shalev-Shwartz; Online Learning and Online Convex Optimization
- Murata, Takenouchi, Kanamori, Eguchi; Information geometry of U-Boost and Bregman divergence
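To make the boosting references concrete, here is a minimal sketch of AdaBoost with exhaustive decision stumps, following the Freund-Schapire weighting scheme. The stump weak learner and round count are illustrative choices.

```python
import numpy as np

def adaboost_stumps(X, y, n_rounds=20):
    """Minimal AdaBoost (Freund-Schapire) with decision stumps.
    Labels y must be in {-1, +1}. Each round picks the stump with
    lowest weighted error, then reweights the examples."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)          # example weights
    ensemble = []
    for _ in range(n_rounds):
        best = None
        # Exhaustive stump search: feature, threshold, polarity.
        for j in range(d):
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] <= t, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, s)
        err, j, t, s = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)    # stump weight
        pred = s * np.where(X[:, j] <= t, 1, -1)
        w *= np.exp(-alpha * y * pred)           # upweight mistakes
        w /= w.sum()
        ensemble.append((j, t, s, alpha))
    return ensemble

def adaboost_predict(ensemble, X):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * s * np.where(X[:, j] <= t, 1, -1)
                for j, t, s, a in ensemble)
    return np.sign(score)
```

A single stump cannot fit an interval-shaped 1-D target, but the boosted ensemble of thresholds can, which is the classic illustration of the weak-to-strong guarantee.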
Sparse coding, dictionary learning and matrix factorization
- Dictionary learning:
- Aharon, Elad, Bruckstein; K-SVD and its non-negative variant for dictionary design
- Mairal, Bach, Ponce; Task-Driven Dictionary Learning
- Mairal, Bach, Ponce, Sapiro; Online learning for matrix factorization and sparse coding
- Sparse coding and compressed sensing:
- Hastie, Tibshirani, Wainwright; Statistical Learning with Sparsity: The Lasso and Generalizations (Chapman & Hall/CRC Monographs on Statistics & Applied Probability)
- Candes, Wakin; Enhancing Sparsity by Reweighted L1 Minimization
- Foucart, Rauhut; A Mathematical Introduction to Compressive Sensing (Applied and Numerical Harmonic Analysis)
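The lasso objective at the heart of the sparsity references above can be minimized by iterative soft-thresholding (ISTA), a standard proximal-gradient scheme; here is a short NumPy sketch. The fixed iteration count is an illustrative choice (in practice one would check a convergence criterion or use the accelerated FISTA variant).

```python
import numpy as np

def ista_lasso(A, b, lam, n_iter=500):
    """Iterative soft-thresholding (ISTA) for the lasso problem
        min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.
    Each step is a gradient step on the smooth part followed by the
    soft-threshold proximal map of the l1 penalty."""
    L = np.linalg.norm(A, 2) ** 2    # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - b)        # gradient of the smooth part
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x
```

With an orthonormal design the solution has a closed form (elementwise soft-thresholding of `b`), which gives an easy correctness check.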
Deep learning, neural network, feature learning
- Deep learning:
- Feature learning:
- Lee, Grosse, Ranganath, Ng; Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
- Bengio, Courville, Vincent; Representation Learning: A Review and New Perspectives
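As a toy illustration of the representation-learning theme above, here is a two-layer network trained by plain backpropagation on XOR, the standard example of a target no linear model (and hence no fixed linear feature map) can fit. Everything here (architecture, learning rate, loss) is an illustrative choice, not taken from the cited papers.

```python
import numpy as np

def train_xor_net(hidden=8, lr=0.5, epochs=5000, seed=0):
    """Two-layer net (tanh hidden layer, sigmoid output) trained by
    full-batch gradient descent on cross-entropy for XOR. Returns
    the final predicted probabilities for the four inputs."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
    y = np.array([0.0, 1.0, 1.0, 0.0])
    W1 = rng.normal(0, 1, (2, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 1, hidden);      b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                  # learned hidden features
        p = 1 / (1 + np.exp(-(H @ W2 + b2)))      # output probability
        dout = (p - y) / len(y)                   # dCE/dlogit
        dH = np.outer(dout, W2) * (1 - H ** 2)    # backprop through tanh
        W2 -= lr * H.T @ dout; b2 -= lr * dout.sum()
        W1 -= lr * X.T @ dH;   b1 -= lr * dH.sum(0)
    H = np.tanh(X @ W1 + b1)
    return 1 / (1 + np.exp(-(H @ W2 + b2)))
```

After training, the hidden layer has learned a nonlinear feature of the inputs under which the classes become separable, which is the point the representation-learning survey makes at scale.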
Learning from multiple sources
- Multi-view semi-supervised learning:
- Xu, Tao, Xu; A survey of multi-view learning
- Crammer, Kearns, Wortman; Learning from Multiple Sources
- Multi-task learning:
- Argyriou, Evgeniou, and Pontil; Convex multi-task feature learning
Random Geometric Graphs and Networks
- (Generalized) BHH theorem and application:
- Percolation theory:
- Penrose; Random Geometric Graphs
- Random network theory:
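A quick way to build intuition for Penrose's book is to simulate: sample points uniformly in the unit square, connect pairs closer than a radius r, and check connectivity. (Penrose studies, among other things, the threshold radius at which such graphs become connected, roughly of order sqrt(log n / n).) The BFS check below is a generic implementation detail, not from the book.

```python
import numpy as np

def rgg_connected(n, r, seed=0):
    """Sample a random geometric graph: n uniform points in the unit
    square, edges between pairs at Euclidean distance < r. Returns
    True if the resulting graph is connected (checked by BFS)."""
    rng = np.random.default_rng(seed)
    P = rng.random((n, 2))
    dist = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
    adj = (dist < r) & ~np.eye(n, dtype=bool)
    # Flood-fill from vertex 0.
    seen = np.zeros(n, dtype=bool)
    stack = [0]; seen[0] = True
    while stack:
        v = stack.pop()
        for u in np.nonzero(adj[v] & ~seen)[0]:
            seen[u] = True
            stack.append(u)
    return bool(seen.all())
```

With r larger than the square's diagonal the graph is complete, hence connected; with a tiny radius almost no edges form and the graph is essentially surely disconnected.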
Differential Geometry in statistics, information theory and learning
- Manton; A Primer on Stochastic Differential Geometry for Signal Processing (http://arxiv.org/abs/1302.0430)
- Amari, Nagaoka; Methods of Information Geometry (Translations of Mathematical Monographs). A classic of information geometry; requires some background in differential geometry.
- Murata, Takenouchi, Kanamori, Eguchi; Information geometry of U-Boost and Bregman divergence
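Since the U-Boost paper above is built on Bregman divergences, it may help to recall the definition before reading it. For a strictly convex, differentiable generator $\varphi$:

```latex
\[
  D_\varphi(x, y) \;=\; \varphi(x) - \varphi(y)
    - \langle \nabla\varphi(y),\, x - y \rangle ,
\]
% i.e. the gap between phi and its tangent plane at y.
% Examples: phi(x) = ||x||^2 gives squared Euclidean distance;
% phi(x) = sum_i x_i log x_i, on the probability simplex,
% gives the Kullback-Leibler divergence.
```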
Information Divergence Estimation and Applications
- Graph-Based Approaches
- Henze, Penrose; On the multivariate runs test
- K-NN Methods
- KDE Plug-in Methods
- Moon, Sricharan, Greenewald, Hero; Improving convergence of divergence functional ensemble estimators
- Kandasamy, Krishnamurthy, Poczos, Wasserman, Robins; Nonparametric von Mises Estimators for Entropies, Divergences and Mutual Informations
- Singh, Poczos; Exponential Concentration of a Density Functional Estimator
- Other Methods
- Nguyen, Wainwright, Jordan; Estimating divergence functionals and the likelihood ratio by convex risk minimization
- Bayes Error Bounds
- Berisha, Wisler, Hero, Spanias; Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure
- Moon, Delouille, Hero; Meta learning of bounds on the Bayes classifier error
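To make the k-NN family above concrete, here is a one-dimensional sketch of the 1-nearest-neighbor KL divergence estimator in the style of Wang, Kulkarni, and Verdu (2009). The brute-force distance computation and sample sizes are illustrative; the cited papers treat general dimension, bias correction, and convergence rates.

```python
import numpy as np

def knn_kl_estimate(x, y):
    """1-NN Kullback-Leibler divergence estimate D(P||Q) from 1-D
    samples x ~ P (size n) and y ~ Q (size m):
        D_hat = (1/n) sum_i log(nu_i / rho_i) + log(m / (n - 1)),
    where rho_i is the distance from x_i to its nearest neighbor
    among the other x's, and nu_i its distance to the nearest y."""
    n, m = len(x), len(y)
    dxx = np.abs(x[:, None] - x[None, :])
    np.fill_diagonal(dxx, np.inf)      # exclude self-distance
    rho = dxx.min(1)
    nu = np.abs(x[:, None] - y[None, :]).min(1)
    return np.log(nu / rho).mean() + np.log(m / (n - 1))
```

Sanity checks: on two sample sets from the same distribution the estimate should be near zero, and it should grow when the second distribution is shifted away (true KL between N(0,1) and N(2,1) is 2).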
Target Detection/Tracking/Localization
- Localization in wireless sensor networks:
- Costa, Patwari, and Hero; Distributed weighted-multidimensional scaling with adaptive weighting for node localization in sensor networks
- Rangarajan, Raich, and Hero; Sparse multidimensional scaling for blind tracking in sensor networks
- An overview of tracking algorithms, including Kalman filters, extensions to Kalman filters, and particle filters:
- Arulampalam et al.; A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking
- An overview on linearization of the particle filter proposal density:
- Doucet et al.; On sequential Monte Carlo sampling methods for Bayesian filtering
- Multiple target tracking using particle filters:
- Kreucher, Kastella, and Hero; Multitarget Tracking using the Joint Multitarget Probability Density
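The particle-filter tutorials above center on the bootstrap (sampling-importance-resampling) filter; here is a minimal sketch for a scalar random-walk model with Gaussian observation noise. The model, particle count, and resample-every-step policy are illustrative simplifications of what the tutorials cover.

```python
import numpy as np

def bootstrap_pf(obs, n_particles=500, q=0.5, r=1.0, seed=0):
    """Bootstrap (SIR) particle filter for the scalar model
        x_t = x_{t-1} + N(0, q^2),   y_t = x_t + N(0, r^2).
    Each step: propagate particles through the dynamics, weight by
    the Gaussian observation likelihood, record the posterior mean,
    then resample. Returns the sequence of posterior-mean estimates."""
    rng = np.random.default_rng(seed)
    parts = rng.normal(0.0, 1.0, n_particles)
    est = []
    for y in obs:
        parts = parts + rng.normal(0.0, q, n_particles)   # propagate
        logw = -0.5 * ((y - parts) / r) ** 2              # log-likelihood
        w = np.exp(logw - logw.max())
        w /= w.sum()
        est.append(np.dot(w, parts))                      # posterior mean
        idx = rng.choice(n_particles, n_particles, p=w)   # resample
        parts = parts[idx]
    return np.array(est)
```

On this linear-Gaussian model the filter's estimates should track the hidden state more accurately than the raw noisy observations do.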
Adaptive Sensing
- Bashan, Raich, and Hero; Optimal two-stage search for sparse targets using convex criteria
- Chong, Kreucher, and Hero; Partially Observable Markov Decision Process Approximations for Adaptive Sensing
- Hero, Kreucher, and Blatt; “Information theoretic approaches to sensor management”, Ch.3 in Foundations and Applications of Sensor Management
LaTeX tools
- TikZ and PGF for drawing within LaTeX slides
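A minimal self-contained example of the kind of figure TikZ makes easy (two boxed nodes joined by a labeled arrow); node names and styles here are arbitrary choices, and the `standalone` class just crops the output to the picture:

```latex
\documentclass{standalone}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
  % Two named nodes joined by a labeled arrow -- the basic TikZ idiom.
  \node[draw, rounded corners] (in)  at (0, 0) {input};
  \node[draw, rounded corners] (out) at (3, 0) {output};
  \draw[->, thick] (in) -- node[above] {$f$} (out);
\end{tikzpicture}
\end{document}
```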