Preface
Part I. Basics
1. Getting Started
Introduction
Other Tools
Setting Up Your Environment
Prerequisites
Starting Apache Spark
Checking Out the Code
Getting Familiar with Apache Spark
Starting Apache Spark with Spark NLP
Loading and Viewing Data in Apache Spark
Hello World with Spark NLP
2. Natural Language Basics
What Is Natural Language?
Origins of Language
Spoken Language Versus Written Language
Linguistics
Phonetics and Phonology
Morphology
Syntax
Semantics
Sociolinguistics: Dialects, Registers, and Other Varieties
Formality
Context
Pragmatics
Roman ]akobson
How To Use Pragmatics
Writing Systems
Origins
Alphabets
Abiads
Abugidas
Syllabaries
Logographs
Encodings
ASCII
Unicode
UTF-8
Exercises: Tokenizing
Tokenize English
Tokenize Greek
Tokenize Ge'ez (Amharic)
Resources
3. NLP on Apache Spark
Parallelism, Concurrency, Distributing Computation
Parallelization Before Apache Hadoop
MapReduce and Apache Hadoop
Apache Spark
Architecture of Apache Spark
Physical Architecture
Logical Architecture
Spark SQL and Spark MLlib
Transformers
Estimators and Models
Evaluators
NLP Libraries
Functionality Libraries
Annotation Libraries
NLP in Other Libraries
Spark NLP
Annotation Library
Stages
Pretrained Pipelines
Finisher
Exercises: Build a Topic Model
Resources
4. Deep Learning Basics
Gradient Descent
Backpropagation
Convolutional Neural Networks
Filters
Pooling
Recurrent Neural Networks
Backpropagation Through Time
Elman Nets
LSTMs
Exercise 1
Exercise 2
Resources
Part II. Building Blocks
5. Processing Words
6. Information Retrieval
7. Classification and Regression
8. Sequence Modeling with Keras
9. Information Extraction
10. Topic Modeling
11. Word Embeddings
Part III. Applications
12. Sentiment Analysis and Emotion Detection
13. Building Knowledqe Bases
14. Search Engine
15. Chatbot
16. Object Character Recognition
Part IV. Building NLP Systems
17. Supporting Multiple Languages
18. Human Labeling
19. Productionizing NLP Applications
Glossary
Index