Preface
1.Introduction: Data-Analytic Thinking
The Ubiquity of Data Opportunities
Example: Hurricane Frances
Example: Predicting Customer Churn
Data Science, Engineering, and Data-Driven Decision Making
Data Processing and "Big Data"
From Big Data 1.0 to Big Data 2.0
Data and Data Science Capability as a Strategic Asset
Data-Analytic Thinking
This Book
Data Mining and Data Science, Revisited
Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist
Summary
2.Business Problems and Data Science Solutions
From Business Problems to Data Mining Tasks
Supervised Versus Unsupervised Methods
Data Mining and Its Results
The Data Mining Process
Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
Implications for Managing the Data Science Team
Other Analytics Techniques and Technologies
Statistics
Database Querying
Data Warehousing
Regression Analysis
Machine Learning and Data Mining
Answering Business Questions with These Techniques
Summary
3.Introduction to Predictive Modeling: From Correlation to Supervised Segmentation
Models, Induction, and Prediction
Supervised Segmentation
Selecting Informative Attributes
Example: Attribute Selection with Information Gain
Supervised Segmentation with Tree-Structured Models
Visualizing Segmentations
Trees as Sets of Rules
Probability Estimation
Example: Addressing the Churn Problem with Tree Induction
Summary
4.Fitting a Model to Data
Classification via Mathematical Functions
Linear Discriminant Functions
Optimizing an Objective Function
An Example of Mining a Linear Discriminant from Data
Linear Discriminant Functions for Scoring and Ranking Instances
Support Vector Machines, Briefly
Regression via Mathematical Functions
Class Probability Estimation and Logistic "Regression"
Logistic Regression: Some Technical Details
Example: Logistic Regression versus Tree Induction
Nonlinear Functions, Support Vector Machines, and Neural Networks
5.Overfitting and Its Avoidance
6.Similarity, Neighbors, and Clusters
7.Decision Analytic Thinking I: What Is a Good Model?
8.Visualizing Model Performance
9.Evidence and Probabilities
10.Representing and Mining Text
11.Decision Analytic Thinking II: Toward Analytical Engineering
12.Other Data Science Tasks and Techniques
13.Data Science and Business Strategy
14.Conclusion
A.Proposal Review Guide
B.Another Sample Proposal
Glossary
Bibliography
Index