Technical Education Community
www.teccses.org

Natural Language Processing with Spark NLP (Reprint Edition)

Author: Alex Thomas

Pages: 347

Publisher: Southeast University Press

Publication date: 2021

ISBN: 9787564195113

E-book formats: pdf/epub/txt

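About the Book

If you want to build an enterprise-quality application that uses natural language text but aren't sure where to begin or which tools to use, this practical guide will help you get started. Alex Thomas, principal data scientist at Wisecube, shows software engineers and data scientists how to build scalable natural language processing (NLP) applications using deep learning and the Apache Spark NLP library. Through concrete examples, practical and theoretical explanations, and hands-on exercises for using NLP on the Spark processing framework, this book teaches you everything from basic linguistics and writing systems to sentiment analysis and search engines. You'll also explore special concerns for developing text-based applications, such as performance.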

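About the Author

Alex Thomas is a principal data scientist at Wisecube. He has applied natural language processing and machine learning to clinical data, identity data, employer and job-seeker data, and now biochemical data. Alex has worked with Apache Spark since version 0.9, and has also used a range of NLP libraries and frameworks, including UIMA and OpenNLP.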

Table of Contents

Preface

Part I. Basics

1. Getting Started

Introduction

Other Tools

Setting Up Your Environment

Prerequisites

Starting Apache Spark

Checking Out the Code

Getting Familiar with Apache Spark

Starting Apache Spark with Spark NLP

Loading and Viewing Data in Apache Spark

Hello World with Spark NLP

2. Natural Language Basics

What Is Natural Language?

Origins of Language

Spoken Language Versus Written Language

Linguistics

Phonetics and Phonology

Morphology

Syntax

Semantics

Sociolinguistics: Dialects, Registers, and Other Varieties

Formality

Context

Pragmatics

Roman Jakobson

How To Use Pragmatics

Writing Systems

Origins

Alphabets

Abjads

Abugidas

Syllabaries

Logographs

Encodings

ASCII

Unicode

UTF-8

Exercises: Tokenizing

Tokenize English

Tokenize Greek

Tokenize Ge’ez (Amharic)

Resources

3. NLP on Apache Spark

Parallelism, Concurrency, Distributing Computation

Parallelization Before Apache Hadoop

MapReduce and Apache Hadoop

Apache Spark

Architecture of Apache Spark

Physical Architecture

Logical Architecture

Spark SQL and Spark MLlib

Transformers

Estimators and Models

Evaluators

NLP Libraries

Functionality Libraries

Annotation Libraries

NLP in Other Libraries

Spark NLP

Annotation Library

Stages

Pretrained Pipelines

Finisher

Exercises: Build a Topic Model

Resources

4. Deep Learning Basics

Gradient Descent

Backpropagation

Convolutional Neural Networks

Filters

Pooling

Recurrent Neural Networks

Backpropagation Through Time

Elman Nets

LSTMs

Exercise 1

Exercise 2

Resources

Part II. Building Blocks

5. Processing Words

6. Information Retrieval

7. Classification and Regression

8. Sequence Modeling with Keras

9. Information Extraction

10. Topic Modeling

11. Word Embeddings

Part III. Applications

12. Sentiment Analysis and Emotion Detection

13. Building Knowledge Bases

14. Search Engine

15. Chatbot

16. Object Character Recognition

Part IV. Building NLP Systems

17. Supporting Multiple Languages

18. Human Labeling

19. Productionizing NLP Applications

Glossary

Index

Article link:https://www.teccses.org/1274430.html