
Data Analysis with Open Source Tools (Reprint Edition)

Price: ¥82.00

Author: Philipp K. Janert (USA)
Publisher: Southeast University Press
Tags: Software engineering / development project management

ISBN: 9787564126742 Publication date: 2011-05-01 Binding: Paperback
Format: 16mo Pages: 509

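Synopsis

Collecting data is relatively easy, but turning raw information into something useful requires knowing how to extract precisely what you need. Through the in-depth coverage in Data Analysis with Open Source Tools (Reprint Edition, English), intermediate and experienced programmers who are interested in data analysis will learn techniques for working with data in a business environment. You will learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and how to feed your understanding back into your organization through business plans, accurate metrics reports, and other means. The hands-on workshops at the end of each chapter let you try out the concepts step by step. Most importantly, you will learn how to think about the data you want to obtain, rather than relying on tools to think for you.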

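About the Author

Philipp K. Janert currently provides consulting services in data analysis and mathematical modeling; he previously worked as a physicist and software engineer. He is the author of Gnuplot in Action: Understanding Data with Graphs (Manning) and has written for the O'Reilly Network, IBM developerWorks, and IEEE Software. He holds a Ph.D. in theoretical physics from the University of Washington.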

Table of Contents

PREFACE
1 INTRODUCTION
 Data Analysis
 What's in This Book
 What's with the Workshops?
 What's with the Math?
 What You'll Need
 What's Missing
PART I Graphics: Looking at Data
2 A SINGLE VARIABLE: SHAPE AND DISTRIBUTION
 Dot and Jitter Plots
 Histograms and Kernel Density Estimates
 The Cumulative Distribution Function
 Rank-Order Plots and Lift Charts
 Only When Appropriate: Summary Statistics and Box Plots
 Workshop: NumPy
 Further Reading
3 TWO VARIABLES: ESTABLISHING RELATIONSHIPS
 Scatter Plots
 Conquering Noise: Smoothing
 Logarithmic Plots
 Banking
 Linear Regression and All That
 Showing What's Important
 Graphical Analysis and Presentation Graphics
 Workshop: matplotlib
 Further Reading
4 TIME AS A VARIABLE: TIME-SERIES ANALYSIS
 Examples
 The Task
 Smoothing
 Don't Overlook the Obvious!
 The Correlation Function
 Optional: Filters and Convolutions
 Workshop: scipy.signal
 Further Reading
5 MORE THAN TWO VARIABLES: GRAPHICAL MULTIVARIATE ANALYSIS
 False-Color Plots
 A Lot at a Glance: Multiplots
 Composition Problems
 Novel Plot Types
 Interactive Explorations
 Workshop: Tools for Multivariate Graphics
 Further Reading
6 INTERMEZZO: A DATA ANALYSIS SESSION
 A Data Analysis Session
 Workshop: gnuplot
 Further Reading
PART II Analytics: Modeling Data
7 GUESSTIMATION AND THE BACK OF THE ENVELOPE
 Principles of Guesstimation
 How Good Are Those Numbers?
 Optional: A Closer Look at Perturbation Theory and
 Error Propagation
 Workshop: The Gnu Scientific Library (GSL)
 Further Reading
8 MODELS FROM SCALING ARGUMENTS
 Models
 Arguments from Scale
 Mean-Field Approximations
 Common Time-Evolution Scenarios
 Case Study: How Many Servers Are Best?
 Why Modeling?
 Workshop: Sage
 Further Reading
9 ARGUMENTS FROM PROBABILITY MODELS
 The Binomial Distribution and Bernoulli Trials
 The Gaussian Distribution and the Central Limit Theorem
 Power-Law Distributions and Non-Normal Statistics
 Other Distributions
 Optional: Case Study--Unique Visitors over Time
 Workshop: Power-Law Distributions
 Further Reading
10 WHAT YOU REALLY NEED TO KNOW ABOUT CLASSICAL STATISTICS
 Genesis
 Statistics Defined
 Statistics Explained
 Controlled Experiments Versus Observational Studies
 Optional: Bayesian Statistics--The Other Point of View
 Workshop: R
 Further Reading
11 INTERMEZZO: MYTHBUSTING--BIGFOOT, LEAST SQUARES, AND ALL THAT
 How to Average Averages
 The Standard Deviation
 Least Squares
 Further Reading
PART III Computation: Mining Data
12 SIMULATIONS
 A Warm-Up Question
 Monte Carlo Simulations
 Resampling Methods
 Workshop: Discrete Event Simulations with SimPy
 Further Reading
13 FINDING CLUSTERS
 What Constitutes a Cluster?
 Distance and Similarity Measures
 Clustering Methods
 Pre- and Postprocessing
 Other Thoughts
 A Special Case: Market Basket Analysis
 A Word of Warning
 Workshop: Pycluster and the C Clustering Library
 Further Reading
14 SEEING THE FOREST FOR THE TREES: FINDING
 IMPORTANT ATTRIBUTES
 Principal Component Analysis
 Visual Techniques
 Kohonen Maps
 Workshop: PCA with R
 Further Reading
15 INTERMEZZO: WHEN MORE IS DIFFERENT
 A Horror Story
 Some Suggestions
 What About Map/Reduce?
 Workshop: Generating Permutations
 Further Reading
PART IV Applications: Using Data
16 REPORTING, BUSINESS INTELLIGENCE, AND DASHBOARDS
 Business Intelligence
 Corporate Metrics and Dashboards
 Data Quality Issues
 Workshop: Berkeley DB and SQLite
 Further Reading
17 FINANCIAL CALCULATIONS AND MODELING
 The Time Value of Money
 Uncertainty in Planning and Opportunity Costs
 Cost Concepts and Depreciation
 Should You Care?
 Is This All That Matters?
 Workshop: The Newsvendor Problem
 Further Reading
18 PREDICTIVE ANALYTICS
 Introduction
 Some Classification Terminology
 Algorithms for Classification
 The Process
 The Secret Sauce
 The Nature of Statistical Learning
 Workshop: Two Do-It-Yourself Classifiers
 Further Reading
19 EPILOGUE: FACTS ARE NOT REALITY
A PROGRAMMING ENVIRONMENTS FOR SCIENTIFIC COMPUTATION
 AND DATA ANALYSIS
 Software Tools
 A Catalog of Scientific Software
 Writing Your Own
 Further Reading
B RESULTS FROM CALCULUS
 Common Functions
 Calculus
 Useful Tricks
 Notation and Basic Math
 Where to Go from Here
 Further Reading
C WORKING WITH DATA
 Sources for Data
 Cleaning and Conditioning
 Sampling
 Data File Formats
 The Care and Feeding of Your Data Zoo
 Skills
 Terminology
 Further Reading
INDEX
