Contents
Biography iv
Preface v
PART 1 INTRODUCTION
CHAPTER 1 Statistical Machine Learning
1.1 Types of Learning 3
1.2 Examples of Machine Learning Tasks 4
1.2.1 Supervised Learning 4
1.2.2 Unsupervised Learning 5
1.2.3 Further Topics 6
1.3 Structure of This Textbook 8
PART 2 STATISTICS AND PROBABILITY
CHAPTER 2 Random Variables and Probability Distributions
2.1 Mathematical Preliminaries 11
2.2 Probability 13
2.3 Random Variable and Probability Distribution 14
2.4 Properties of Probability Distributions 16
2.4.1 Expectation, Median, and Mode 16
2.4.2 Variance and Standard Deviation 18
2.4.3 Skewness, Kurtosis, and Moments 19
2.5 Transformation of Random Variables 22
CHAPTER 3 Examples of Discrete Probability Distributions
3.1 Discrete Uniform Distribution 25
3.2 Binomial Distribution 26
3.3 Hypergeometric Distribution 27
3.4 Poisson Distribution 31
3.5 Negative Binomial Distribution 33
3.6 Geometric Distribution 35
CHAPTER 4 Examples of Continuous Probability Distributions
4.1 Continuous Uniform Distribution 37
4.2 Normal Distribution 37
4.3 Gamma Distribution, Exponential Distribution, and Chi-Squared Distribution 41
4.4 Beta Distribution 44
4.5 Cauchy Distribution and Laplace Distribution 47
4.6 t-Distribution and F-Distribution 49
CHAPTER 5 Multidimensional Probability Distributions
5.1 Joint Probability Distribution 51
5.2 Conditional Probability Distribution 52
5.3 Contingency Table 53
5.4 Bayes’ Theorem 53
5.5 Covariance and Correlation 55
5.6 Independence 56
CHAPTER 6 Examples of Multidimensional Probability Distributions 61
6.1 Multinomial Distribution 61
6.2 Multivariate Normal Distribution 62
6.3 Dirichlet Distribution 63
6.4 Wishart Distribution 70
CHAPTER 7 Sum of Independent Random Variables
7.1 Convolution 73
7.2 Reproductive Property 74
7.3 Law of Large Numbers 74
7.4 Central Limit Theorem 77
CHAPTER 8 Probability Inequalities
8.1 Union Bound 81
8.2 Inequalities for Probabilities 82
8.2.1 Markov’s Inequality and Chernoff’s Inequality 82
8.2.2 Cantelli’s Inequality and Chebyshev’s Inequality 83
8.3 Inequalities for Expectation 84
8.3.1 Jensen’s Inequality 84
8.3.2 Hölder’s Inequality and Schwarz’s Inequality 85
8.3.3 Minkowski’s Inequality 86
8.3.4 Kantorovich’s Inequality 87
8.4 Inequalities for the Sum of Independent Random Variables 87
8.4.1 Chebyshev’s Inequality and Chernoff’s Inequality 88
8.4.2 Hoeffding’s Inequality and Bernstein’s Inequality 88
8.4.3 Bennett’s Inequality 89
CHAPTER 9 Statistical Estimation
9.1 Fundamentals of Statistical Estimation 91
9.2 Point Estimation 92
9.2.1 Parametric Density Estimation 92
9.2.2 Nonparametric Density Estimation 93
9.2.3 Regression and Classification 93
9.2.4 Model Selection 94
9.3 Interval Estimation 95
9.3.1 Interval Estimation for Expectation of Normal Samples 95
9.3.2 Bootstrap Confidence Interval 96
9.3.3 Bayesian Credible Interval 97
CHAPTER 10 Hypothesis Testing
10.1 Fundamentals of Hypothesis Testing 99
10.2 Test for Expectation of Normal Samples 100
10.3 Neyman-Pearson Lemma 101
10.4 Test for Contingency Tables 102
10.5 Test for Difference in Expectations of Normal Samples 104
10.5.1 Two Samples without Correspondence 104
10.5.2 Two Samples with Correspondence 105
10.6 Nonparametric Test for Ranks 107
10.6.1 Two Samples without Correspondence 107
10.6.2 Two Samples with Correspondence 108
10.7 Monte Carlo Test 108
PART 3 GENERATIVE APPROACH TO STATISTICAL PATTERN RECOGNITION
CHAPTER 11 Pattern Recognition via Generative Model Estimation 113
11.1 Formulation of Pattern Recognition 113
11.2 Statistical Pattern Recognition 115
11.3 Criteria for Classifier Training 117
11.3.1 MAP Rule 117
11.3.2 Minimum Misclassification Rate Rule 118
11.3.3 Bayes Decision Rule 119
11.3.4 Discussion 121
11.4 Generative and Discriminative Approaches 121
CHAPTER 12 Maximum Likelihood Estimation
12.1 Definition 123
12.2 Gaussian Model 125
12.3 Computing the Class-Posterior Probability 127
12.4 Fisher’s Linear Discriminant Analysis (FDA