Book Details

Natural Language Processing with Spark NLP (Reprint Edition)

Author: Alex Thomas

Publisher: Southeast University Press

Publication date: 2021-07-01

ISBN: 9787564195113

List price: ¥132.00

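Description
  If you want to build an enterprise-grade application that uses natural language text but aren't sure where to begin or which tools to use, this practical guide will help you get started. Alex Thomas, principal data scientist at Wisecube, shows software engineers and data scientists how to build scalable natural language processing (NLP) applications using deep learning and the Apache Spark NLP library. Through concrete examples, practical and theoretical explanations, and hands-on exercises in using NLP on the Spark processing framework, this book teaches you everything from basic linguistics and writing systems to sentiment analysis and search engines. You'll also explore issues that need special attention when developing text-based applications, such as performance. Across the book's four parts, you'll learn NLP basics and building blocks before diving into application and system building:
  Basics: Understand the fundamentals of natural language processing, NLP on Apache Spark, and deep learning.
  Building blocks: Learn techniques for building NLP applications, including tokenization, sentence segmentation, and named-entity recognition, and discover how and why they work.
  Applications: Explore the design, development, and experimentation process for building your own NLP applications.
  Building NLP systems: Consider options for productionizing and deploying NLP models, including which human languages to support.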
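About the Author
  Alex Thomas is a principal data scientist at Wisecube. He has applied natural language processing and machine learning to clinical data, identity data, employer and job-seeker data, and now biochemical data. Alex has used Apache Spark since version 0.9 and has worked with a variety of NLP libraries and frameworks, including UIMA and OpenNLP.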
Table of Contents
Preface
Part I. Basics
1. Getting Started
Introduction
Other Tools
Setting Up Your Environment
Prerequisites
Starting Apache Spark
Checking Out the Code
Getting Familiar with Apache Spark
Starting Apache Spark with Spark NLP
Loading and Viewing Data in Apache Spark
Hello World with Spark NLP
2. Natural Language Basics
What Is Natural Language?
Origins of Language
Spoken Language Versus Written Language
Linguistics
Phonetics and Phonology
Morphology
Syntax
Semantics
Sociolinguistics: Dialects, Registers, and Other Varieties
Formality
Context
Pragmatics
Roman Jakobson
How To Use Pragmatics
Writing Systems
Origins
Alphabets
Abjads
Abugidas
Syllabaries
Logographs
Encodings
ASCII
Unicode
UTF-8
Exercises: Tokenizing
Tokenize English
Tokenize Greek
Tokenize Ge'ez (Amharic)
Resources
3. NLP on Apache Spark
Parallelism, Concurrency, Distributing Computation
Parallelization Before Apache Hadoop
MapReduce and Apache Hadoop
Apache Spark
Architecture of Apache Spark
Physical Architecture
Logical Architecture
Spark SQL and Spark MLlib
Transformers
Estimators and Models
Evaluators
NLP Libraries
Functionality Libraries
Annotation Libraries
NLP in Other Libraries
Spark NLP
Annotation Library
Stages
Pretrained Pipelines
Finisher
Exercises: Build a Topic Model
Resources
4. Deep Learning Basics
Gradient Descent
Backpropagation
Convolutional Neural Networks
Filters
Pooling
Recurrent Neural Networks
Backpropagation Through Time
Elman Nets
LSTMs
Exercise 1
Exercise 2
Resources
Part II. Building Blocks
5. Processing Words
6. Information Retrieval
7. Classification and Regression
8. Sequence Modeling with Keras
9. Information Extraction
10. Topic Modeling
11. Word Embeddings
Part III. Applications
12. Sentiment Analysis and Emotion Detection
13. Building Knowledge Bases
14. Search Engine
15. Chatbot
16. Object Character Recognition
Part IV. Building NLP Systems
17. Supporting Multiple Languages
18. Human Labeling
19. Productionizing NLP Applications
Glossary
Index