
Author: Lewis Tunstall
Pages: 405
Publisher: Southeast University Press
Publication date: 2023
ISBN: 9787576605891
E-book formats: pdf/epub/txt
About the Book
Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving excellent results on a wide range of natural language processing tasks. If you're a data scientist or programmer, this hands-on book, now in full color, shows you how to train and scale these large models using Hugging Face Transformers, a Python-based deep learning library. Transformers have been used to write realistic news stories, improve Google Search queries, and even create chatbots that tell corny jokes. In this guide, authors Lewis Tunstall, Leandro von Werra, and Thomas Wolf, creators of Hugging Face Transformers, take a hands-on approach to teaching you how transformers work and how to integrate them into your applications. You'll quickly learn the variety of tasks they can help you solve.
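As a taste of the hands-on approach described above, here is a minimal sketch (not taken from the book) of the library's high-level pipeline API applied to text classification, one of the tasks listed in the table of contents; the example sentence and the implicit default model are illustrative assumptions.

# Minimal sketch: text classification with the Hugging Face Transformers pipeline API.
# Assumes the transformers package and a backend such as PyTorch are installed.
from transformers import pipeline

# Omitting an explicit model name lets the library load its default
# checkpoint for the text-classification task.
classifier = pipeline("text-classification")

# The input sentence is purely illustrative.
print(classifier("Transformers are remarkably easy to use."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]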
About the Author
Lewis Tunstall is a machine learning engineer at Hugging Face. His current work focuses on developing tools for the NLP community and teaching people how to use them effectively.
Table of Contents
Preface
1. Hello Transformers
The Encoder-Decoder Framework
Attention Mechanisms
Transfer Learning in NLP
Hugging Face Transformers: Bridging the Gap
A Tour of Transformer Applications
Text Classification
Named Entity Recognition
Question Answering
Summarization
Translation
Text Generation
The Hugging Face Ecosystem
The Hugging Face Hub
Hugging Face Tokenizers
Hugging Face Datasets
Hugging Face Accelerate
Main Challenges with Transformers
Conclusion
2. Text Classification
The Dataset
A First Look at Hugging Face Datasets
From Datasets to DataFrames
Looking at the Class Distribution
How Long Are Our Tweets?
From Text to Tokens
Character Tokenization
Word Tokenization
Subword Tokenization
Tokenizing the Whole Dataset
Training a Text Classifier
Transformers as Feature Extractors
Fine-Tuning Transformers
Conclusion
3. Transformer Anatomy
The Transformer Architecture
The Encoder
Self-Attention
The Feed-Forward Layer
Adding Layer Normalization
Positional Embeddings
Adding a Classification Head
The Decoder
Meet the Transformers
The Transformer Tree of Life
The Encoder Branch
The Decoder Branch
The Encoder-Decoder Branch
Conclusion
4. Multilingual Named Entity Recognition
The Dataset
Multilingual Transformers
A Closer Look at Tokenization
The Tokenizer Pipeline
The SentencePiece Tokenizer
Transformers for Named Entity Recognition
The Anatomy of the Transformers Model Class
Bodies and Heads
Creating a Custom Model for Token Classification
Loading a Custom Model
Tokenizing Texts for NER
Performance Measures
Fine-Tuning XLM-RoBERTa
Error Analysis
Cross-Lingual Transfer
When Does Zero-Shot Transfer Make Sense?
Fine-Tuning on Multiple Languages at Once
Interacting with Model Widgets
Conclusion
5. Text Generation
The Challenge with Generating Coherent Text
Greedy Search Decoding
Beam Search Decoding
Sampling Methods
Top-k and Nucleus Sampling
Which Decoding Method Is Best?
Conclusion
6. Summarization
The CNN/DailyMail Dataset
Text Summarization Pipelines
Summarization Baseline
GPT-2
T5
BART
PEGASUS
Comparing Different Summaries
Measuring the Quality of Generated Text
BLEU
ROUGE
Evaluating PEGASUS on the CNN/DailyMail Dataset
Training a Summarization Model
Evaluating PEGASUS on SAMSum
Fine-Tuning PEGASUS
Generating Dialogue Summaries
Conclusion
7. Question Answering
Building a Review-Based QA System
The Dataset
Extracting Answers from Text
Using Haystack to Build a QA Pipeline
Improving Our QA Pipeline
Evaluating the Retriever
Evaluating the Reader
Domain Adaptation
Evaluating the Whole QA Pipeline
Going Beyond Extractive QA
Conclusion
8. Making Transformers Efficient in Production
Intent Detection as a Case Study
Creating a Performance Benchmark
Making Models Smaller via Knowledge Distillation
Knowledge Distillation for Fine-Tuning
Knowledge Distillation for Pretraining
Creating a Knowledge Distillation Trainer
Choosing a Good Student Initialization
Finding Good Hyperparameters with Optuna
Benchmarking Our Distilled Model
Making Models Faster with Quantization
Benchmarking Our Quantized Model
Optimizing Inference with ONNX and the ONNX Runtime
Making Models Sparser with Weight Pruning
Sparsity in Deep Neural Networks
Weight Pruning Methods
Conclusion
9. Dealing with Few to No Labels
Building a GitHub Issues Tagger
Getting the Data
Preparing the Data
Creating Training Sets
Creating Training Slices
Implementing a Naive Bayesline
Working with No Labeled Data
Working with a Few Labels
Data Augmentation
Using Embeddings as a Lookup Table















